How can I use SLURM's sacct command to show memory usage statistics for a job that I am running?



I want to find out how much memory my jobs are using on a cluster that uses the SLURM scheduler. When I run the sacct command, the output does not include information about memory usage. The man page for sacct shows a long and somewhat confusing array of options, and it is hard to tell which one is best.

CURATOR: John Goodhue


ANSWER: This will do the trick: sacct --format="CPUTime,MaxRSS"


CLARIFICATION: Can you tell us more about what you want to learn from the memory


ANSWER: Here is the command to use if the job ID is known:

login002 koleinik > sacct -o MaxRSS -j 129086


If the job ID is not known, you can specify a date range instead:

login002 koleinik > sacct -S2017-01-01-00:00 -E2018-03-26-10:15  -o jobid,start,end,state,MaxRSS
       JobID               Start                 End      State     MaxRSS
------------ ------------------- ------------------- ---------- ----------
4901635      2017-11-11T13:09:40 2017-11-11T13:09:43     FAILED
4901635.bat+ 2017-11-11T13:09:40 2017-11-11T13:09:43     FAILED     33664K
4901636      2017-11-11T13:17:32 2017-11-11T13:17:35     FAILED
4901636.bat+ 2017-11-11T13:17:32 2017-11-11T13:17:35     FAILED     36628K
4901638      2017-11-11T13:54:33 2017-11-11T13:54:37     FAILED
4901638.bat+ 2017-11-11T13:54:33 2017-11-11T13:54:37     FAILED     36844K
4901639      2017-11-11T13:56:12 2017-11-11T13:58:13 CANCELLED+
4901639.bat+ 2017-11-11T13:56:12 2017-11-11T13:58:20  CANCELLED 133404464K
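
The MaxRSS values above are hard to read at a glance, so here is a minimal sketch of a helper that converts them to MiB. The function name maxrss_to_mib is made up for illustration. It assumes the pipe-delimited output produced by sacct's --parsable2 option, that the K suffix means KiB (1024 bytes), and that a value with no suffix is in bytes.

```shell
# Convert the MaxRSS column of pipe-delimited sacct output to MiB.
# Reads lines of the form "JobID|MaxRSS" on stdin.
maxrss_to_mib() {
    awk -F'|' '
        $2 == "" { next }                      # job allocation lines carry no MaxRSS
        {
            value = $2 + 0                     # numeric part, e.g. 33664 from "33664K"
            unit  = substr($2, length($2), 1)  # trailing unit letter
            if      (unit == "K") mib = value / 1024
            else if (unit == "M") mib = value
            else if (unit == "G") mib = value * 1024
            else                  mib = value / (1024 * 1024)  # assumption: no suffix means bytes
            printf "%s %.1f MiB\n", $1, mib
        }'
}

# Typical use on a cluster (requires sacct):
#   sacct -j 129086 --noheader --parsable2 -o jobid,MaxRSS | maxrss_to_mib
```

With --parsable2 the job step names are not truncated, so the batch step shows up as 4901635.batch rather than 4901635.bat+.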


ANSWER: It’s useful to know that SLURM reports memory usage as RSS (resident set size), so the memory-related fields have RSS in their names. The man page lists four fields that can be passed to the “format” option and that might be of use:

AveRSS – Average resident set size of all tasks in job
MaxRSS – Maximum resident set size of all tasks in job
MaxRSSNode – The node on which the maxrss occurred
MaxRSSTask – The task ID where the maxrss occurred

For example,

sacct --format="AveRSS,MaxRSS,MaxRSSNode"

will display the average and maximum memory footprint for all tasks in your jobs, and the nodes on which the maximum memory footprints occurred. With no job ID or date range given, sacct lists your jobs since midnight of the current day.
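
If a job has many steps, it can help to pick out the one with the largest footprint. The following is a rough sketch only: the function name peak_rss is hypothetical, and it assumes pipe-delimited input in the field order JobID, AveRSS, MaxRSS, MaxRSSNode, with MaxRSS carrying a K, M or G suffix.

```shell
# Report the job step with the largest MaxRSS.
# Reads "JobID|AveRSS|MaxRSS|MaxRSSNode" lines on stdin.
peak_rss() {
    awk -F'|' '
        # Turn a value such as "36844K" into a number of KiB.
        function to_kib(s,    n, u) {
            n = s + 0
            u = substr(s, length(s), 1)
            if (u == "M") return n * 1024
            if (u == "G") return n * 1024 * 1024
            return n    # assumption: K suffix, the unit shown in the output above
        }
        $3 != "" {
            kib = to_kib($3)
            if (kib > max) { max = kib; step = $1; node = $4 }
        }
        END {
            if (step != "") printf "%s peaked at %d KiB on %s\n", step, max, node
        }'
}

# Typical use on a cluster (requires sacct):
#   sacct -j 129086 --noheader --parsable2 -o jobid,AveRSS,MaxRSS,MaxRSSNode | peak_rss
```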


ANSWER: There’s a StackOverflow Q&A on this topic at: