Slurm Basics

Terminology

Account

A grouping attribute in Slurm for managing usage budgets and partition access for research teams. Ask your system administrator for the accounts you belong to.

Affinity

The mapping that defines how scheduled processes are allocated to compute resources, such as cores, sockets, nodes, and accelerator hardware.

Partition

A grouping of machines that share common queuing restrictions, such as wall clock and resource limits. Ask your system administrator which partitions you are able to submit to.

Virtual core

On GCP, a virtual core is a single hyperthread that is able to execute instructions. For Intel CPUs on GCP, there are two hyperthreads per physical core.

Resource allocation and job submission

salloc

https://slurm.schedmd.com/salloc.html

Slurm command for allocating resources to execute job steps (see srun). A common use case is interactive jobs without graphics forwarding, typically for software development.

Example:

salloc --partition=PARTITION --account=ACCOUNT [--nodes=N --cpus-per-task=C --mem=MEMORY] --time=HH:MM:SS

To release your resource allocation, use scancel.

To execute commands on allocated resources, use srun.

Usage charges are incurred for every second a resource allocation is active.
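
For instance, a hypothetical request for one node for 30 minutes might look like the command below; the partition name debug and account name myteam are placeholders for values at your site:

salloc --partition=debug --account=myteam --nodes=1 --time=00:30:00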

srun

https://slurm.schedmd.com/srun.html

Slurm command for executing a job step on allocated resources. This command will create an allocation if one does not exist. Best practice is to specify your task affinity in the command options. Common use cases include executing single commands from the command line or within batch scripts, and interactive workflows with graphics forwarding.

Example

srun [--ntasks=N --ntasks-per-node=M --mem=MEMORY] COMMAND

Interactive job with graphics forwarding

srun [--ntasks=N --ntasks-per-node=M --mem=MEMORY] --x11 --pty /bin/bash
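
Because best practice is to specify task affinity, here is a minimal sketch that binds each of four tasks to a core; the application name ./my_app is a placeholder:

srun --ntasks=4 --cpu-bind=cores ./my_app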

sbatch

https://slurm.schedmd.com/sbatch.html

Slurm command for executing batch workloads. A common use case is the execution of "set-and-forget" serial and parallel workloads. This command expects an execution script, most often with a "batch header" that specifies resource and time requirements in addition to job execution commands.

Example

sbatch --partition=PARTITION --account=ACCOUNT BATCH_SCRIPT
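
Options given on the command line take precedence over the matching #SBATCH directives inside the script, so a hypothetical submission (the partition and account names are placeholders) could look like:

sbatch --partition=debug --account=myteam --time=00:10:00 BATCH_SCRIPT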

scancel

https://slurm.schedmd.com/scancel.html

Slurm command for canceling jobs and job steps and releasing resource allocations.

Example:

scancel JOB-ID
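
scancel can also target an individual job step, or all of your own jobs at once; USERNAME is a placeholder for your login name:

scancel JOB-ID.STEP-ID
scancel --user=USERNAME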

Monitoring jobs and resources

squeue

https://slurm.schedmd.com/squeue.html

Slurm command for checking the status of jobs. When executed without options, all jobs are reported.

Example

squeue --user=USERNAME --partition=PARTITION
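
To narrow the listing to jobs in a particular state, for example running jobs only:

squeue --user=USERNAME --states=RUNNING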

sinfo

https://slurm.schedmd.com/sinfo.html

Slurm command for checking the status of partitions and compute nodes.
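
Example

For instance, to summarize a single partition, or to list per-node details:

sinfo --partition=PARTITION
sinfo --Node --long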

Submitting batch jobs

Batch jobs are useful for workflows where you have an application that can run unattended for long periods of time. To submit a batch job, you must first create a shell script (e.g. bash), called a "batch script", that contains the commands you wish to execute. At the top of the batch script, it’s best practice to include a batch header that describes to Slurm the resources you wish to use. The batch header consists of lines at the top of the file, each beginning with ‘#SBATCH’. For example, you can specify how many tasks per node to use, where to log output, how many CPUs to use, how many GPUs to use, etc. Once the batch script has been created (for this next example we used ‘example.batch’ as our batch file), it can be executed by running the following command:

$ sbatch example.batch

Below is a simple example batch file:

#!/bin/bash
#SBATCH --job-name=example_job_name   # Job name
#SBATCH --ntasks=1                    # Run a single task
#SBATCH --ntasks-per-node=1           # At most one task per node
#SBATCH --partition=this-partition    # Partition to submit to
#SBATCH --gres=gpu:2                  # Request two GPUs
#SBATCH --time=00:05:00               # Time limit hrs:min:sec
#SBATCH --output=serial_test_%j.log   # Standard output and error log

hostname

The above batch file has multiple constraints that dictate how the job will be executed.

  • --job-name=’name’ sets the job name
  • --ntasks=1 advises the Slurm controller that job steps run within the allocation will launch a maximum of 1 task
  • --ntasks-per-node=1 When used by itself, this constraint requests that 1 task per node be invoked. When used with --ntasks, --ntasks-per-node is treated as the maximum count of tasks per node.
  • --partition=this-partition requests the job execute on a partition called "this-partition"
  • --gres=gpu:2 indicates that 2 GPUs are requested to execute this batch job
  • --time=00:05:00 sets a total run time of 5 minutes for job allocation.
  • --output=serial_test_%j.log creates a file containing the batch script’s stdout and stderr; the %j pattern is replaced with the job ID

SchedMD’s sbatch documentation provides a more complete description of the sbatch command line interface and the available options for specifying resource requirements and task affinity.
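
For parallel workloads, job steps are typically launched with srun from inside the batch script. Below is a minimal sketch along those lines; the application name ./my_mpi_app is hypothetical, and the partition name is reused from the example above:

#!/bin/bash
#SBATCH --job-name=parallel_example   # Job name
#SBATCH --partition=this-partition    # Partition to submit to
#SBATCH --nodes=2                     # Request two nodes
#SBATCH --ntasks-per-node=4           # Four tasks on each node
#SBATCH --time=00:10:00               # Time limit hrs:min:sec
#SBATCH --output=parallel_%j.log      # Standard output and error log

# Launch 8 tasks (2 nodes x 4 tasks per node) as a single job step
srun ./my_mpi_app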

Once you have submitted your batch job, you can check the status of your job with the `squeue` command.

$ squeue --user=<username>

The `--user=` flag is optional and allows you to filter the listing to only your jobs.
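
To check on a single job, you can also pass the job ID directly:

$ squeue --jobs=<job-id>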

Interactive Workflows

The interactive workflows described here use a combination of salloc and srun command line interfaces. It is highly recommended that you read through SchedMD's salloc documentation and srun documentation to understand how to reserve and release compute resources in addition to specifying task affinity and other resource-task bindings.

For all interactive workflows, you should be aware that you are charged for each second of allocated compute resources. It is best practice to set a wall-time when allocating resources. This practice helps avoid situations where you will be billed for idle resources you have reserved.

Allocate and Execute Workflow

With Slurm, you can allocate compute resources that are reserved for your sole use. This is done using the salloc command. As an example, you can reserve exclusive access to one compute node for an hour on the default partition:

$ salloc --time=1:00:00 -N1 --exclusive

Once resources are allocated, Slurm responds with a job id. From here, you can execute commands on compute resources using `srun`. srun is a command line interface for executing “job steps” in Slurm. You can specify how much of the allocated compute resources to use for each job step. For example, the srun command below launches ./my-application as 4 tasks within the allocation.

$ srun -n4 ./my-application
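
An allocation can host several job steps, each with its own task count; the application names below are placeholders:

$ srun --ntasks=2 ./preprocess-data
$ srun --ntasks=4 ./my-application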

It is highly recommended that you familiarize yourself with Slurm’s salloc and srun command line tools so that you can make efficient use of your compute resources.

To release your allocation before the requested wall-time, you can use scancel:

$ scancel <job-id>

After cancelling your job, or once the wall-clock limit is exceeded, Slurm will release the allocation and automatically delete the compute nodes for you.

Interactive Shell Workflow (with Graphics Forwarding)

If your workflow requires graphics forwarding from compute resources, you can allocate resources as before using salloc, e.g.,

$ salloc --time=1:00:00 -N1 --exclusive

Once resources are allocated, you can launch a shell on the compute resources with X11 forwarding enabled.

$ srun -N1 --pty --x11 /bin/bash
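
From this shell you can launch graphical applications; for example, if the xclock utility happens to be installed on the compute node, it is a quick way to confirm that X11 forwarding works:

$ xclock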

Once you have finished your work, exit the shell and release your resources.

$ exit
$ scancel <job-id>