Aravindh.net

Slurm

Slurm is an open-source job scheduling system for Linux clusters, often used in the HPC space.

Most Slurm-related help can be found on the excellent wiki here: https://wiki.fysik.dtu.dk/niflheim/SLURM

Here I note some tips/tricks/commands that I ended up discovering at my $dayjob as a sysadmin managing a Slurm cluster.

Run a job on a specific node in the cluster

scontrol update jobid=734116 qos=high partition=ghpc_v2 nodelist=sky006
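
It is worth checking that an update like the one above actually took effect. Here is a minimal bash sketch, assuming the job is still pending and that `scontrol show job` prints `Partition=`, `QOS=` and `ReqNodeList=` fields as it does on the versions I have seen. The function name `pin_job` is made up for illustration and is not part of Slurm.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Slurm command): move a pending job to a
# partition and node, then print the fields that should have changed.
pin_job() {
    local jobid="$1" partition="$2" node="$3"

    # Same update as the one-liner above, minus qos, with the values as arguments.
    scontrol update jobid="$jobid" partition="$partition" nodelist="$node" || return 1

    # Print only the interesting key=value pairs from the job record.
    scontrol show job "$jobid" | grep -oE '(Partition|QOS|ReqNodeList)=[^ ]*'
}
```

Usage would be something like `pin_job 734116 ghpc_v2 sky006`. Add `qos=high` to the update line if the QOS should change as well.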

Filter jobs and act on them

myst | grep xyguo | grep PD | grep ghpc_shor | awk '{print $1}' | xargs -I{} scontrol update jobid={} partition=ghpc_v2
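
The grep chain above matches text anywhere in the line, so a job name containing "PD" would also be caught. A possibly safer variant lets `squeue` do the filtering with its own `--user`, `--states` and `--partition` options. This is a sketch only: the function name `move_pending_jobs` is made up, and the source partition is passed as an argument because its full name does not appear in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Slurm command): move every pending job of
# one user from one partition to another.
move_pending_jobs() {
    local user="$1" from="$2" to="$3"

    # --noheader drops the column titles; --format='%i' prints only the job ID.
    squeue --noheader --user="$user" --states=PD --partition="$from" --format='%i' |
        while read -r jobid; do
            # Skip blank lines, then update one job at a time.
            [ -n "$jobid" ] || continue
            scontrol update jobid="$jobid" partition="$to"
        done
}
```

It would be called as `move_pending_jobs xyguo <source_partition> ghpc_v2`. Using a `while read` loop rather than `xargs` means nothing runs when the list is empty.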