Ask.Cyberinfrastructure

What are good uses for job arrays?

job-array

#1

Hey HPC nerds! I’m putting together a little tutorial and introduction to job arrays (very simple), and the most important piece is a list of compelling reasons to use them in the first place, say, over submitting each job individually with sbatch. Let’s put our heads together and think! I’m relatively new to using them, so my list is likely limited.

  • Running a randomized simulation many times, with output files numbered 1…N. The array index serves as the variable that names each output file.
  • Running an analysis over many inputs, where each input is named according to the array index and the outputs follow suit. The same idea applies to directory names. (A sketch of both cases follows this list.)
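
For instance, a minimal Slurm job script along those lines might look like this; the simulate binary, the file names, and the range 1-100 are placeholders, not part of any particular workflow:

#!/bin/bash
#SBATCH --job-name=sim-array
#SBATCH --array=1-100
#SBATCH --output=logs/sim_%A_%a.out   # %A = job ID, %a = array task ID

# The array index both seeds the run and names its files
# (simulate, input_N.dat and result_N.dat are hypothetical).
./simulate --seed "${SLURM_ARRAY_TASK_ID}" \
           --input "input_${SLURM_ARRAY_TASK_ID}.dat" \
           --output "result_${SLURM_ARRAY_TASK_ID}.dat"

Submitted once with sbatch, this fans out into 100 independent tasks, one per index.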

What do others think?


#3

An example which reads multiple parameters from a separate file is handy, e.g. in the job script:

#!/bin/bash
# An input file with a line for each array element
# and parameters separated by spaces.
PARAMS=/path/to/parameter/file.txt
# "Nq;d" makes sed print only line N, so each task picks out the line
# matching its own array index.
read -ra params <<<"$(sed "${SLURM_ARRAY_TASK_ID}q;d" "$PARAMS")"

After this, params is a bash array holding every field from line ${SLURM_ARRAY_TASK_ID} of the $PARAMS file. Using an array allows the number of parameters to vary from line to line, so the next lines in the script might be

command -t "${params[0]}" -b "${params[1]}" "${params[@]:2}"

to use the first two elements to set parameters and then pass everything else to the command.
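
For concreteness, a parameter file for that command might contain lines like the following (the values and extra flags are invented); the first field feeds -t, the second -b, and anything left over is passed straight through:

0.1 8 --verbose
0.2 8
0.5 16 --restart checkpoint.dat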

Once a parameter file is ready (by whatever means it is created), the job can then be submitted with

sbatch --array=1-$(awk 'END{print NR}' /path/to/parameter/file.txt) job_array.sh

edit: Although I use the line read from the file as parameters here, there’s no reason the entire command can’t be in the parameter file, so a job array can really be used to run any arbitrary set of commands as its tasks.
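
In that variant the job script body shrinks to something like the sketch below; the command file is hypothetical, and each of its lines is assumed to be a complete shell command:

CMDS=/path/to/command/file.txt
# Run whichever command sits on the line matching this task's index.
eval "$(sed "${SLURM_ARRAY_TASK_ID}q;d" "$CMDS")"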

Also, when working on a cluster that limits the number of job array elements, it can be useful to use a step in the array spec and have each array task run several commands for its step. Say with

--array=1-100:10

Then each element can work on lines ${SLURM_ARRAY_TASK_ID} through ${SLURM_ARRAY_TASK_ID} + 9 of the input file, i.e. ten lines per task. Season to taste, of course.
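
A sketch of that pattern, reusing the parameter file from above and with command standing in for the real program as in the earlier example (it assumes the file has a multiple of ten lines):

#!/bin/bash
#SBATCH --array=1-100:10

STEP=10
PARAMS=/path/to/parameter/file.txt
# Each task covers lines SLURM_ARRAY_TASK_ID .. SLURM_ARRAY_TASK_ID+STEP-1.
for (( line = SLURM_ARRAY_TASK_ID; line < SLURM_ARRAY_TASK_ID + STEP; line++ )); do
    read -ra params <<<"$(sed "${line}q;d" "$PARAMS")"
    command -t "${params[0]}" -b "${params[1]}" "${params[@]:2}"
done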


#2

Any large data operation where the results are not dependent on each other and the inputs and outputs can be designated by the iterator. Here’s what we put together to demo the problem and some approaches:

In this case, the scheduler is SGE, but the approaches (and some of the problems addressed) are identical among schedulers:

https://goo.gl/cEWNJ6
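
For reference, the Slurm examples above translate almost directly: in SGE the array range is requested with -t and the index arrives in $SGE_TASK_ID. A minimal hypothetical sketch:

#!/bin/bash
#$ -t 1-100
#$ -o logs/

# Same pattern as the Slurm scripts: the task ID picks the input
# and names the output (analyse and the file names are placeholders).
./analyse "input_${SGE_TASK_ID}.dat" > "result_${SGE_TASK_ID}.out"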
ta,
Harry