Ask.Cyberinfrastructure

Pre-empting job termination by the scheduler

slurm
qow
#1

On Slurm, when will the scheduler end a job, and how will I know this ahead of time? How can I catch the error code before my job gets killed?

#2

Hi jma,

For jobs that are approaching their TimeLimit, you can use

--mail-type=TIME_LIMIT_50,TIME_LIMIT_80,TIME_LIMIT_90

to get a warning email when the job reaches 50%, 80% or 90% of its TimeLimit, respectively.
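As a rough sketch, these options would normally sit in the header of the batch script together with an address to send the mail to. The address, job name and time limit below are placeholders, not values from your site:

```bash
#!/bin/bash
#SBATCH --job-name=example               # hypothetical job name
#SBATCH --time=02:00:00                  # hypothetical TimeLimit of 2 hours
#SBATCH --mail-user=user@example.org     # placeholder address, use your own
#SBATCH --mail-type=TIME_LIMIT_50,TIME_LIMIT_80,TIME_LIMIT_90

# The actual workload goes here.
```

With a 2 hour limit this would send mail at roughly 60, 96 and 108 minutes of run time, assuming mail delivery is set up on the cluster.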

For jobs being preempted where PreemptMode=CANCEL, the scheduler first sends SIGCONT and SIGTERM as a warning. Later, once the configured GraceTime has elapsed, it sends SIGCONT, SIGTERM and SIGKILL. SIGKILL cannot be caught, so any cleanup has to finish within the GraceTime.

You can react to these signals in your job script by using a trap. See https://bash.cyberciti.biz/guide/Trap_statement for a description and examples.
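A minimal sketch of such a trap in a bash job script might look like the following. The handler name `on_term`, the variables, the time limit and the `sleep 2` workload are all placeholders for illustration. Whether the batch shell itself receives the signal can depend on how your site has configured Slurm, so treat this as a starting point rather than a guaranteed recipe.

```bash
#!/bin/bash
#SBATCH --time=00:10:00     # hypothetical limit for illustration

terminated=0
work_pid=""

# Hypothetical handler: runs when the shell receives SIGTERM.
on_term() {
    terminated=1
    echo "Caught SIGTERM: stopping workload and saving state" >&2
    if [ -n "$work_pid" ]; then
        # Pass the signal on to the workload; ignore errors if it already exited.
        kill -TERM "$work_pid" 2>/dev/null || true
    fi
    # Copy results, write a checkpoint, etc. would go here.
}
trap on_term TERM

# Placeholder workload; replace `sleep 2` with the real command.
# It runs in the background because bash only runs a trap handler
# after the current foreground command has finished.
sleep 2 &
work_pid=$!

# `wait` returns early if a trapped signal arrives, so the handler runs promptly.
wait "$work_pid"
status=$?

if [ "$terminated" -eq 1 ]; then
    echo "Job was interrupted by the scheduler; cleanup done" >&2
fi
```

Running the workload in the background and then calling `wait` is the important design choice here: if the command ran in the foreground, the handler would only fire after it ended, which may be too late once SIGKILL follows.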