Description
We observe that jobs submitted through our front-end wrapper script sometimes end with failed 100 : assumedly after job, and the qmaster messages file reports that it cannot read the usage file for the job. Importantly, the job is never executed on the execution host.
qmaster message
2025-12-11 16:21:27.352581| worker|03|rdocs01|W|job 142486 .1 failed on host rd2696 assumedly after job because: can't read usage file for job 142486 .1
Client-side messages
waiting for interactive job to be scheduled ...
Your interactive job 142486 has been successfully scheduled.
Establishing builtin session to host rd2696 ...
Your job 142486 ("test_job") has been submitted
qacct -j 142486
start_time -/-
end_time -/-
granted_pe NONE
slots 1
failed 100 : assumedly after job
exit_status 0
ru_wallclock 0
ru_utime 0.000
ru_stime 0.000
ru_maxrss 0
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 0
ru_majflt 0
ru_nswap 0
ru_inblock 0
ru_oublock 0
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 0
ru_nivcsw 0
wallclock 0.000
cpu 0.000
mem 0.000
io 0.000
iow 0.000
maxvmem 0
maxrss 0
arid undefined
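To make the relevant accounting fields easier to compare between a failing and a working run, the output above can be filtered down to the lines that show whether the job ever started. This is only a sketch: `accounting_summary` is a hypothetical helper name, and the sample input is copied from the record above rather than read from a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: reads `qacct -j` style output on stdin and keeps only
# the fields that indicate whether the job actually ran.
accounting_summary() {
    awk '$1 == "start_time" || $1 == "end_time" ||
         $1 == "failed"     || $1 == "exit_status" { print }'
}

# On a cluster this would be fed from qacct, e.g.:
#   qacct -j 142486 | accounting_summary
# Here it is fed a few lines copied from the record in this report.
accounting_summary <<'EOF'
start_time -/-
end_time -/-
granted_pe NONE
slots 1
failed 100 : assumedly after job
exit_status 0
ru_wallclock 0
EOF
```

For the failing job this leaves the start_time, end_time, failed and exit_status lines, which is enough to see that the job has no start or end time but still reports failed 100.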
Environment
- Product: OCS 9.0.9 (build 141125-1311), official binaries
- OS/Distro: Oracle Linux 8.10
- Front-end: wrapper submit script (run_job) that asks for yes/no confirmation before invoking qrsh and related tools
Observations
- The job never starts on the execution host (no start_time, no resource usage).
- Accounting shows zero usage and failed=100.
- qmaster logs indicate a missing usage file, suggesting the shepherd never wrote it.
- The issue occurs only when confirmation is piped (e.g., echo y | run_job or y | run_job).
- When typing y interactively, the job runs normally.
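The difference between the piped and the interactive case can be illustrated without a cluster. The sketch below is not our actual run_job script: `confirm_and_run` and `describe_stdin` are hypothetical stand-ins, and it assumes the real wrapper lets the child command inherit the wrapper's stdin. Whether this is what makes the job fail is not confirmed; it only shows what the child sees on stdin once the piped answer has been consumed.

```shell
#!/bin/sh
# Hypothetical stand-in for run_job: reads a yes/no answer from stdin, then
# runs the given command, which inherits the same stdin.
confirm_and_run() {
    printf 'Submit job? [y/n] ' >&2
    read -r answer || return 1
    [ "$answer" = "y" ] || return 1
    "$@"
}

# Hypothetical stand-in for the child process (qrsh in the real wrapper):
# reports what kind of stdin it inherited.
describe_stdin() {
    if [ -t 0 ]; then
        echo "stdin: terminal"
    elif IFS= read -r line; then
        echo "stdin: pipe with data"
    else
        echo "stdin: not a terminal, at EOF"
    fi
}

# Piped confirmation: the answer is consumed by the prompt, so the child
# is left with a stdin that is not a terminal and is already at end of file.
echo y | confirm_and_run describe_stdin
```

When the same function is run from a terminal and y is typed at the prompt, the child reports a terminal instead, which matches the only difference we see between the failing and the working case.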
Steps to Reproduce
- Use the submit wrapper script that prompts for confirmation.
- Pipe y into the script:
  echo y | run_job # or y | run_job
- Observe that the job is scheduled but never executed, and accounting shows failed 100.
Control Case (works)
- When typing y manually at the prompt (interactive stdin), the job runs normally and accounting is written.
Thanks in advance!