Monitoring Slurm Cluster Jobs

To check whether a Slurm cluster is ready, or to get information about batch jobs submitted to it, run Slurm commands at a command prompt on the Slurm Controller machine. For an autoscaling cluster, the Slurm Controller runs on the cluster head node. See Connecting to an Autoscaling Cluster Head Node. For standard HPC clusters, the Slurm Controller may be on a separate virtual desktop. See Connecting to a Linux Virtual Machine Using SSH.

The two commands used to get cluster and job information, sinfo and squeue, are described below. For a full list of Slurm commands, go to https://slurm.schedmd.com/quickstart.html.
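
As a quick orientation, here is a minimal sketch of running both commands from a shell on the Slurm Controller machine. The helper name have_slurm is hypothetical and not part of Slurm; it only checks that sinfo and squeue are on the PATH, so the script prints a hint instead of failing when run on a machine without the Slurm client tools.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): succeeds only if both
# client commands can be found on the PATH.
have_slurm() {
    command -v sinfo >/dev/null 2>&1 && command -v squeue >/dev/null 2>&1
}

if have_slurm; then
    sinfo                    # partitions and the state of their nodes
    squeue                   # all jobs currently queued or running
    squeue -u "$(id -un)"    # only the jobs owned by the current user
else
    echo "sinfo/squeue not found; run this on the Slurm Controller machine."
fi
```

If the first command lists nodes in the idle state, the cluster is typically ready to accept jobs; the exact columns and states shown depend on the site's Slurm configuration.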