Tips for allocating Job Resources
Cherry-creek currently has two types of compute nodes:
cp | 48 "penguin" cherry-creek 2 nodes (Penguin Computing Relion servers), each with 2 Intel Xeon E5-2640v3 8-core CPUs and 128 GB RAM |
ci | 24 "waterfall" cherry-creek 2 nodes (Intel servers), each with 2 Intel Xeon E5-2697v2 12-core CPUs and 192 GB RAM |
Please do not specifically request "ci" nodes unless your jobs need the additional memory.
The "ci" nodes are typically very busy with long-running jobs. If you request that your jobs run on them, the jobs may wait a significant amount of time before "ci" nodes become available.
Example resource specifications
The following are commonly used resource allocation parameters:
walltime=# | the TOTAL wall time that the job will be allowed to run (format: HH:MM:SS, i.e. hours, minutes, seconds). Each job queue has its own limit: the normal queue (workq) is limited to 744 hours (31 days), and the test queue (small) is limited to 15 minutes. |
select=# | tells PBS Pro to apply the settings that follow to this number of separate nodes (each known as a "chunk"). |
ncpus=# | specifies the number of CPU cores to be assigned to each chunk. |
mpiprocs=# | specifies the number of MPI processes to be assigned to each chunk (normally the same as the ncpus value). |
mem=# | specifies the amount of memory (in mb, gb, etc.) to be assigned to each chunk. |
cput=# | specifies the amount of CPU time for the chunk (format: HH:MM:SS, i.e. hours, minutes, seconds). A reasonable value is mpiprocs * walltime. |
Qlist=string | specifies the type of node to run on; cp and ci are the current choices. |
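The cput rule of thumb in the table (mpiprocs * walltime) can be worked out by hand, but a small shell helper avoids arithmetic slips. The following is only a sketch: the function name cput_for is hypothetical and is not part of PBS Pro. It assumes the walltime is given as HH:MM:SS.

```shell
#!/bin/sh
# Hypothetical helper (not a PBS Pro command): multiply an HH:MM:SS
# walltime by the number of MPI processes and print the result as HH:MM:SS.
cput_for() {
    # $1 = walltime as HH:MM:SS, $2 = mpiprocs
    echo "$1" | awk -F: -v n="$2" '{
        # Convert the walltime to seconds, then scale by the process count.
        total = (($1 * 3600) + ($2 * 60) + $3) * n
        printf "%02d:%02d:%02d\n", int(total / 3600), int((total % 3600) / 60), total % 60
    }'
}

# 4 MPI processes for 24 hours of wall time
cput_for 24:00:00 4    # prints 96:00:00
```

The printed value can be pasted into the cput= field of the select statement.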
For example, suppose your MPI job requires 2 chunks of four CPU cores each.
It also requires 30 GB of memory per chunk, and you want it to run on ci nodes.
The resource request would look like:
#PBS -q workq
#PBS -l walltime=24:00:00
#PBS -l select=2:ncpus=4:mpiprocs=4:mem=30gb:cput=96:00:00:Qlist=ci
Also, the operating system requires a minimum of 4 GB of RAM on each node,
so do not allocate more than 124 GB on the cp nodes or 188 GB on the ci nodes.
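The memory limits above can be checked before a job is submitted. The sketch below is illustrative only: max_mem_gb and mem_fits are hypothetical helper names, not PBS Pro commands, and it assumes the mem value is a whole number of gigabytes and that each chunk is placed on a single node.

```shell
#!/bin/sh
# Hypothetical helper: largest mem request (in GB) that fits on a node type,
# leaving 4 GB for the operating system.
max_mem_gb() {
    case "$1" in
        cp) echo 124 ;;   # 128 GB installed, 4 GB reserved
        ci) echo 188 ;;   # 192 GB installed, 4 GB reserved
        *)  echo "unknown node type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: succeed if a request of $2 GB fits on node type $1.
mem_fits() {
    limit=$(max_mem_gb "$1") || return 2
    [ "$2" -le "$limit" ]
}

# A 130 GB chunk is too large for a cp node but fits on a ci node.
if mem_fits cp 130; then echo "cp: ok"; else echo "cp: too large"; fi
if mem_fits ci 130; then echo "ci: ok"; else echo "ci: too large"; fi
```

This is the case where requesting Qlist=ci is justified: the job needs more memory than a cp node can provide.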