The batch or job scheduling system on PALMA-II is called SLURM. If you are used to PBS/Maui and want to switch to SLURM, this document might help you. The job scheduler is used to start and manage computations on the cluster, but also to distribute resources among all users according to their needs. Computation jobs (but also interactive sessions) can be submitted to different queues (called partitions in SLURM terminology), which serve different purposes:
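As a quick orientation, a minimal batch script for the normal partition could look like the following sketch; the job name, module, and program are placeholders, not part of the PALMA-II documentation, and resources should be adjusted to your needs:

```bash
#!/bin/bash

#SBATCH --job-name=my_job        # placeholder job name
#SBATCH --partition=normal       # pick a partition from the tables below
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=36     # up to 36 CPU threads per node on "normal"
#SBATCH --time=24:00:00          # must not exceed the partition's max. walltime

# load your software environment (module name is a placeholder)
module load my_software

srun ./my_program                # placeholder executable
```

Submit the script with `sbatch jobscript.sh` and check its status with `squeue -u $USER`.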
Partitions
Available for everyone:
Name | Purpose | CPU Arch | # Nodes | # GPUs | Compute capability of GPU | max. CPUs (threads) / node | max. Mem / node | max. Walltime | BeeOND storage |
---|---|---|---|---|---|---|---|---|---|
normal | general computations | Skylake (Gold 6140) | 143 / 160 | - | - | 36 | 92 GB / 192 GB | 24 hours | 350 GB |
long | general computations | Skylake | - | - | - | 36 | 92 GB / 192 GB | 7 days | 350 GB |
express | short-running (test) jobs, compilation | Skylake (Gold 6140) | 5 | - | - | 36 | 92 GB | 2 hours | 350 GB |
bigsmp | SMP | Skylake (Gold 6140) | 3 | - | - | 72 | 1.5 TB | 7 days | 350 GB |
largesmp | SMP | Skylake (Gold 6140) | 2 | - | - | 72 | 3 TB | 7 days | 350 GB |
requeue* | Uses free nodes from the group-exclusive partitions | Skylake (Gold 6140) | 68 / 50 / 3 | - | - | 36 / 36 / 72 | 92 GB / 192 GB / 1.5 TB | 1 day | 350 GB |
gpuv100 | Nvidia V100 GPUs | Skylake (Gold 6140) | 1 | 4 | 7.0 | 24 | 192 GB | 7 days | 930 GB |
vis-gpu | Nvidia Titan XP | Skylake (Gold 6140) | 1 | 8 | 6.1 | 24 | 192 GB | 2 days | -- |
vis | Visualization / GUIs | Skylake (Gold 6140) | 1 | - | - | 36 | 92 GB | 2 hours | -- |
broadwell | Legacy Broadwell CPUs | Broadwell | 44 | - | - | 32 | 118 GB | 7 days | 168 GB |
zen2-128C-496G | SMP | Zen2 (EPYC 7742) | 12 | - | - | 128 | 496 GB | 7 days | 1.8 TB |
gpu2080 | GeForce RTX 2080 Ti | Zen3 | 5 | 8 | 7.5 | 32 | 240 GB | 7 days | 930 GB |
gpuexpress | GeForce RTX 2080 Ti | Zen3 | 1 | 8 | 7.5 | 32 | 240 GB | 2 hours | 930 GB |
gputitanrtx | Nvidia Titan RTX | Zen3 | 1 | 4 | 7.5 | 32 | 240 GB | 7 days | 1.4 TB |
gpu3090 | GeForce RTX 3090 | Zen3 | 2 | 8 | 8.6 | 48 | 240 GB | 7 days | -- |
gpua100 | Nvidia A100 | Zen3 | 5 | 4 | 8.0 | 32 | 240 GB | 7 days | 930 GB |
gpuhgx | Nvidia A100 SXM 80GB | Zen3 | 2 | 8 | 8.0 | 64 | 990 GB | 7 days | 7 TB |
gpuexpress
You can allocate a maximum of one job with 2 GPUs, 8 CPU cores, and 60 GB of RAM on this node.
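A request at that limit might look like the following sketch; the flag values are taken from the limits above, and the executable is a placeholder:

```bash
#!/bin/bash

#SBATCH --partition=gpuexpress
#SBATCH --gres=gpu:2          # at most 2 GPUs on this partition
#SBATCH --cpus-per-task=8     # at most 8 CPU cores
#SBATCH --mem=60G             # at most 60 GB of RAM
#SBATCH --time=02:00:00       # partition's max. walltime

srun ./my_gpu_program         # placeholder executable
```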
requeue*
If your job is running on one of the requeue nodes while that node is requested by one of the group-exclusive partitions, your job will be terminated and resubmitted, so use this partition with care!
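Jobs in this partition should therefore tolerate being killed and restarted. A minimal sketch of a requeue-aware job script, assuming the partition is simply named `requeue` (the asterisk above only marks this footnote) and that your program can resume from a checkpoint file (all names are placeholders):

```bash
#!/bin/bash

#SBATCH --partition=requeue
#SBATCH --requeue            # let SLURM resubmit the job after preemption
#SBATCH --time=24:00:00      # partition's max. walltime is 1 day

# SLURM_RESTART_COUNT is set by SLURM once a job has been requeued
if [ "${SLURM_RESTART_COUNT:-0}" -gt 0 ] && [ -f checkpoint.dat ]; then
    srun ./my_program --resume checkpoint.dat   # placeholder resume logic
else
    srun ./my_program                           # fresh start
fi
```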
Group exclusive:
Name | # Nodes | max. CPUs (threads) / node | max. Mem / node | max. Walltime |
---|---|---|---|---|
p0fuchs | 9 | 36 | 92 GB | 7 days |
p0kulesz | 6 / 3 | 36 | 92 GB / 192 GB | 7 days |
p0kapp | 1 | 36 | 92 GB | 7 days |
p0klasen | 1 / 1 | 36 | 92 GB / 192 GB | 7 days |
hims | 25 / 1 | 36 | 92 GB / 192 GB | 7 days |
d0ow | 1 | 36 | 92 GB | 7 days |
q0heuer | 15 | 36 | 92 GB | 7 days |
e0mi | 2 | 36 | 192 GB | 7 days |
e0bm | 1 | 36 | 192 GB | 7 days |
p0rohlfi | 7 / 8 | 36 | 92 GB / 192 GB | 7 days |
SFB858 | 3 | 72 | 1.5 TB | 21 days |
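Whether you can submit to one of these partitions depends on your group membership. The standard SLURM status commands show what is available to you, for example (the partition name is taken from the table above):

```bash
# show node counts and states for one group partition
sinfo -p q0heuer

# list all partitions visible to you, with time limit, node count, and state
sinfo -o "%P %l %D %t"

# list your own pending and running jobs
squeue -u $USER
```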