Jobs on the cluster are usually started by submitting a job (batch) script to one of the partitions (queues). Slurm then takes care of reserving the requested amount of resources and starting the application on the reserved nodes. In the case of Slurm, a job script is a Bash script and can be written locally (using your favorite plain text editor) or directly on the cluster (using Vim, Emacs, nano, ...). A typical example script named job.sh is given below:


Example Job Script
#!/bin/bash

#SBATCH --nodes=1					# the number of nodes you want to reserve
#SBATCH --ntasks-per-node=1 		# the number of tasks/processes per node
#SBATCH --cpus-per-task=36          # the number of CPUs per task
#SBATCH --partition=normal			# on which partition to submit the job
#SBATCH --time=24:00:00				# the max wallclock time (the time limit your job is allowed to run)

#SBATCH --job-name=MyJob123			# the name of your job
#SBATCH --mail-type=ALL				# receive an email when your job starts, finishes normally or is aborted
#SBATCH --mail-user=your_account@uni-muenster.de # your mail address

# LOAD MODULES HERE IF REQUIRED
...
# START THE APPLICATION
...

The #!/bin/bash line tells the shell to execute the script with Bash. #SBATCH is a Slurm directive and is used to configure Slurm. Everywhere else the # sign introduces a comment.


You can submit your script to the batch system with the command: sbatch job.sh
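
sbatch prints the ID of the newly submitted job. A few standard Slurm commands are useful afterwards (a short sketch; the job ID 123456 is only a placeholder):

squeue -u $USER             # list your pending and running jobs
scontrol show job 123456    # show detailed information about a job
scancel 123456              # cancel a job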

MPI parallel Jobs

Start an MPI job with 72 MPI ranks distributed over 2 nodes for 1 hour on the normal partition. Instead of mpirun, the preferred command to start MPI jobs within Slurm is srun.

MPI Job Script
#!/bin/bash

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=36
#SBATCH --partition=normal
#SBATCH --time=01:00:00

#SBATCH --job-name=MyMPIJob123
#SBATCH --output=output.dat
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_account@uni-muenster.de

# load needed modules
module load intel

# Previously needed for Intel MPI (which we use here) - not needed for Open MPI
# export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so

# run the application
srun /path/to/my/mpi/program

Note that srun here starts as many tasks as you requested with --nodes and --ntasks-per-node (2 × 36 = 72 in this example). It is essentially a substitute for mpirun. Know what you are doing when you use it!
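
A quick way to see how many tasks srun actually launches is to let it run a trivial command; a small sketch using the allocation from the script above:

srun hostname               # prints the host name once per task, i.e. 2 x 36 = 72 lines here
srun --ntasks=2 hostname    # explicitly limits this job step to 2 tasks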

OpenMP parallel Jobs

Start a job on 36 CPUs, with one OpenMP thread per CPU, for 1 hour on the normal partition.

OpenMP Job Script
#!/bin/bash

#SBATCH --nodes=1
#SBATCH --cpus-per-task=36
#SBATCH --partition=normal
#SBATCH --time=01:00:00

#SBATCH --job-name=MyOpenMPJob123
#SBATCH --output=output.dat
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_account@uni-muenster.de

# Bind each thread to one core
export OMP_PROC_BIND=TRUE
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# load needed modules
module load intel

# run the application
/path/to/my/openmp/program
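
If you want to check that the binding and thread count are picked up as intended, the OpenMP runtime can print its settings at program start; a minimal sketch:

export OMP_DISPLAY_ENV=TRUE     # ask the OpenMP runtime to print its environment at startup
/path/to/my/openmp/program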

Hybrid MPI/OpenMP Jobs

Start a job on 2 nodes, 9 MPI tasks per node, 4 OpenMP threads per task.

Hybrid MPI/OpenMP Job Script
#!/bin/bash

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=9
#SBATCH --cpus-per-task=4           # reserve 4 CPUs per MPI task for the OpenMP threads
#SBATCH --partition=normal
#SBATCH --time=01:00:00

#SBATCH --job-name=MyHybridJob123
#SBATCH --output=output.dat
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_account@uni-muenster.de

export OMP_NUM_THREADS=4

# load needed modules
module load intel

# run the application
srun /path/to/my/hybrid/program

Hybrid MPI/OpenMP/CUDA Jobs

Start a job on 2 nodes with 2 MPI tasks per node, 4 OpenMP threads per task, and 2 GPUs per node:

Hybrid MPI/OpenMP/CUDA Job Script
#!/bin/bash  

#SBATCH --partition=gpu2080
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:2
#SBATCH --job-name=MyMPIOpenMPCUDAJob
#SBATCH --output=output.dat
#SBATCH --error=error.dat 
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_account@uni-muenster.de

export OMP_NUM_THREADS=4

# load needed modules
module purge
ml palma/2022a
ml CUDA/11.7.0
ml foss/2022a
ml UCX-CUDA/1.12.1-CUDA-11.7.0
ml CMake/3.23.1

# Use UCX to be compatible with Nvidia (formerly Mellanox) Infiniband adapters  
export OMPI_MCA_pml=ucx

# run the application using mpirun in this case 
mpirun /path/to/my/hybrid/program
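
To verify that the allocated GPUs are visible inside the job, you can list them before starting the program; a minimal sketch (assumes nvidia-smi is available on the GPU nodes):

nvidia-smi -L               # lists the GPUs visible on the first node of the allocation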

Interactive Jobs

You can request resources from Slurm and it will allocate an interactive shell for you. On the login node, type the following into your shell:

Interactive Session
salloc --nodes 1 --cpus-per-task 36 -t 00:30:00 --partition express

This will give you a session with 36 CPUs for 30 minutes on the express partition. You will automatically be forwarded to a compute node.
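
Inside the interactive session you can then work as usual, for example load modules and start your application; a short sketch (module and program names are placeholders):

module load intel            # load the modules your application needs
/path/to/my/program          # run the application interactively
exit                         # end the session and release the allocation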

Flexible Submission Script

If you want to change parameters of your job without actually editing the script, you can pass command line arguments to sbatch that override the corresponding #SBATCH directives in the script:

Flexible Submission
sbatch --cpus-per-task 16 submit_script.sh
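
Several options can be combined in a single call, for example (the values are only placeholders):

sbatch --partition=express --time=00:30:00 --job-name=TestRun submit_script.sh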