This is a list of frequently asked questions. As a logged-in user, you can comment on this page. We welcome any suggestions for new FAQs or improvements!


Why are the partitions limited to only 7 days?

Imagine all cluster nodes are busy executing other users' jobs that run for 30 days, while you want to start your own jobs as soon as possible. Or, as an administrator, you have to shut down the cluster for maintenance due to a hardware problem and must wait for the longest-running job to finish. Such use cases do not work well with very long execution times. For the sake of fair usage you are therefore restricted to 7 days (which is already quite a lot for a typical cluster system), so that other users can use the scarce resources as well. If you have a job that has to run for more than 7 days uninterrupted, please contact us at hpc@uni-muenster.de and we will find a solution. Often, restart mechanisms are already implemented in simulation software or can be scripted by the users themselves.
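If your program can write a checkpoint file and resume from it, a long run can be split into several jobs that each stay within the 7-day limit and chained with Slurm job dependencies. A minimal sketch, assuming a program my_simulation and a checkpoint file checkpoint.dat (both placeholders):

#!/bin/bash
#SBATCH --time=7-00:00:00        # stay within the 7-day partition limit
#SBATCH --job-name=long_run

# resume from the last checkpoint if one exists, otherwise start fresh
if [ -f checkpoint.dat ]; then
    ./my_simulation --restart checkpoint.dat    # the restart flag is an assumption, check your software's documentation
else
    ./my_simulation
fi

Submit the first part with "sbatch job.sh" and each follow-up part with "sbatch --dependency=afterany:<jobid> job.sh", so that the next part only starts once the previous one has ended.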

When to contact support?

Do not hesitate to contact us when you need help with topics like:

  • Accessing the cluster
  • Software installations
  • Questions regarding already installed software
  • Software optimizations
  • How to start your jobs

Please contact us if you need help in these areas, but be aware that we might not be able to help in every case. In particular, some software packages are not suited to being installed on an HPC cluster.

I am confused by all this stuff, where do I start on the cluster?

We are always eager to help you, but it really helps if you read the introduction and the getting started section in advance. We are working on a walk-through to make it easier to find the right path if you are new.

My jobs have to wait when I submit them to the batch system, can you buy more nodes?

Short answer: No.

An HPC cluster is a large investment for the university and is typically operated for about 5-10 years until it becomes obsolete. PALMA-II was installed in 2018 and is funded by the DFG, which expects the available capacity to be utilized as completely as possible. It is therefore a desired state that big jobs have to wait at least a while after submission, because it is more economical to have no idle spare resources.

How do I find out how much memory (RAM) I have to reserve for my jobs?

We are currently working on a tutorial that gives you a good way to do this, but as a first try you could do the following (a minimal job script sketch follows the list):

  • Start your job with about 80 GB of RAM reserved.
  • If the job fails with an OOM (Out of Memory) message, this was not enough; try again with 180 GB reserved.
  • Use "squeue -u your_username" to see on which node the job is running.
  • Inspect this node in Ganglia and look at the "free memory" metric to see how much RAM was actually used.
  • In the next run, reserve the amount you saw in the last step plus about 10%.
  • If this works, you are done. If not, increase the reservation by some fair amount.
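A minimal job script sketch for the first step, assuming a single-node job (my_program is a placeholder for your actual program call):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --mem=80G                # first guess from the list above; raise to 180G if the job is killed with an OOM message
#SBATCH --time=1-00:00:00        # adjust to your expected runtime

./my_program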

How can I use the hardware most efficiently?

  • Know if your software can run in parallel (MPI, OpenMP, GPU).
  • Find out how your software scales with increased usage of resources. More is not always faster!
  • Estimate how long your calculation will run and specify it in your job script. The less time you request, the faster your job will actually start (see the sketch after this list).
  • Do not request full nodes if you only need a few cores.
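A minimal sketch of a request that matches the actual need instead of a full node, assuming a program that can use 4 OpenMP threads (names and values are placeholders):

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4        # request only the cores your program can actually use
#SBATCH --mem=16G                # request only the memory it actually needs
#SBATCH --time=06:00:00          # a realistic runtime estimate lets the scheduler start the job earlier

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # match the thread count to the allocation
./my_threaded_program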

My code runs too slow, what can I do?

The first answer to this question should not be "reserve more hardware", since you first have to figure out whether your code is capable of using multiple cores or nodes at all.

If nothing else helps, you can apply for compute time on a larger cluster at another computing site, for example RWTH Aachen.

Can you explain the toolchain concept?

Toolchains are so-called "meta-modules" that load a specific set of software into your environment. A toolchain comprises a compiler, an MPI stack and numerical libraries at specific versions. Examples are listed in the table below. For more information, have a look at the module system.


Toolchain     Compiler                   MPI stack          Numerical libraries
foss/2018a    GCC/6.4.0                  OpenMPI/2.1.2      OpenBLAS/0.2.20, ScaLAPACK/2.0.2, FFTW/3.3.7
intel/2018a   icc, ifort/2018.1.163      impi/2018.1.163    imkl/2018.1.168
foss/2019a    GCC/8.2.0                  OpenMPI/3.1.3      OpenBLAS/0.3.5, ScaLAPACK/2.0.2, FFTW/3.3.8
intel/2019a   icc, ifort/2019.1.144      impi/2018.4.274    imkl/2019.1.144
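For example, a toolchain from the table can be loaded like this (assuming the module is visible in the module tree of the partition you are working on):

module load foss/2019a           # pulls in GCC/8.2.0, OpenMPI/3.1.3, OpenBLAS, ScaLAPACK and FFTW
module list                      # shows everything the toolchain has loaded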

Can you install numpy and scipy?

This is already done. For optimization reasons they are not part of the plain Python module; instead there is a separate module named "SciPy-bundle" which becomes visible once you load the foss or intel toolchain (see the previous question). Loading this module makes numpy, scipy and some additional Python packages available.
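A minimal sketch, assuming the foss/2019a toolchain from the table above and a matching SciPy-bundle installation (the exact module version may differ):

module load foss/2019a
module load SciPy-bundle         # the exact version string depends on the toolchain
python -c "import numpy, scipy; print(numpy.__version__, scipy.__version__)"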

How can I start multiple copies of a shell script in parallel?

Have a look at GNU parallel.
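A minimal sketch for use inside a job script, assuming --cpus-per-task was requested and that myscript.sh and the input files are placeholders for your own script and data:

# run one copy of myscript.sh per input file, with as many copies at a time as cores were requested
parallel -j "$SLURM_CPUS_PER_TASK" ./myscript.sh ::: input1.dat input2.dat input3.dat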

Is the software on all partitions identical?

No, it is not, but we try our best to make it so. PALMA is a very heterogeneous system considering the different types of hardware: CPU architectures include skylake, broadwell, zen2 and zen3, and there are nodes with different GPUs as well as different interconnects (Omnipath and Infiniband). To provide optimized software for all of this hardware, we have to install software multiple times. If you do not find a program in a specific partition, please tell us and we will install it.

Can I see how much my jobs "cost" in terms of future priority?

With "sacct --format="Partition,User,AllocTres%50" -u your_account" you get a list of your currently running jobs and their billing. This is a measure for the amount, the priority of your future jobs will be lowered.

My job submission fails with "sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified"

This typically happens if your membership in u0clstr expired and you had to renew it. For unknown reasons, it is necessary to add the SLURM directive "-A uni" or "--account=uni" to your Slurm script or the command line to make the jobs eligible again. This odd behaviour usually disappears after one day.
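In a job script the workaround looks like the following sketch (only the account line is the relevant addition, everything else is a placeholder):

#!/bin/bash
#SBATCH --account=uni            # workaround until the renewed membership has propagated
#SBATCH --time=01:00:00

./my_program

On the command line, the same can be achieved with "sbatch --account=uni jobscript.sh".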



What are the plans for the future? Is there a roadmap?

We have ordered a number of new, water-cooled systems. There will be Genoa-X-based CPU-only systems, some with large amounts of RAM (e.g. 4 TB), as well as systems equipped with Nvidia RTX 4090 GPUs.

The filesystem will be replaced by an all-flash system in early 2024.

My job fails due to a UCX Error under Open MPI 

UCX  ERROR mm ep failed to connect to remote FIFO id 0xc000000900009e1b: Shared memory error

This is a recently observed issue on the zen3 nodes when UCX is used under Open MPI. We are still investigating what causes this error. Until the cause is found, you can try to work around it by disabling UCX under Open MPI during your job:

mpirun ... --mca pml ^ucx ...
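Inside a job script this could look like the following sketch (my_mpi_program is a placeholder); alternatively, the same MCA parameter can be set through the environment:

# option 1: pass the MCA parameter directly to mpirun
mpirun --mca pml ^ucx ./my_mpi_program

# option 2: set it once via the environment, then call mpirun as usual
export OMPI_MCA_pml=^ucx
mpirun ./my_mpi_program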