How to run complex jobs with PBS Pro on Cherry Creek
For specific information on how to run more common jobs (for example, R analysis), see the following:
In general, apart from the additional steps needed to assign resources to
your job, you perform the same steps inside a batch job as you would from
an interactive command line.
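For instance, a minimal serial batch script is just the commands you would
otherwise type interactively, preceded by #PBS resource lines (a sketch;
the program and file names here are illustrative):

#!/bin/bash
#PBS -l select=1:ncpus=1:mem=4Gb
#PBS -N my-serial-job
#PBS -l walltime=1:0:0
cd $PBS_O_WORKDIR
# exactly what you would run at an interactive prompt:
./my_program input.dat > output.log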
The following is a quick step-by-step guide to getting started running
MPI jobs on cherry-creek.
In this example, we want to run an MPI job that uses a total of 64 processes
(cores). We also want to limit the number of processes running on each node
to 8 (this gives us the flexibility to control how the system allocates the
compute cores, so OpenMP threads or other special needs can be taken into
account).
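The resource request below encodes this as select=8:ncpus=8:mpiprocs=8,
i.e. 8 nodes x 8 MPI ranks per node = 64 processes. As a sketch of the
flexibility just mentioned, a hybrid MPI/OpenMP job could keep the same 64
cores but start fewer MPI ranks (the 2-rank/4-thread split here is
illustrative, not part of this example):

#PBS -l select=8:ncpus=8:mpiprocs=2:mem=60Gb
# 8 nodes x 2 mpiprocs = 16 MPI ranks; each rank has 4 of the
# node's 8 allocated cores available for OpenMP threads:
export OMP_NUM_THREADS=4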
To compile a simple "hello world" MPI program (after logging into cherry-creek),
the following modules need to be activated:
module add intel # activate the Intel compiler suite
module add intelmpi # activate the Intel MPI runtime
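You can confirm that the modules loaded correctly before compiling (the
exact paths and versions reported will vary):

module list    # list the currently loaded modules
which mpicc    # should resolve to the Intel MPI compiler wrapper
mpirun -V      # print the Intel MPI library version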
We are going to use one of the Intel-supplied test programs; make a test
directory and copy the source file:
mkdir example-mpi
cd example-mpi
cp /share/apps/intel/impi_latest/test/test.c test.c
Now compile and link the program:
mpicc test.c -o testc    # compile the sample program
Create a file called testc.pbs with the following contents
(starting in column 1 - do not copy leading spaces):
#!/bin/bash
#PBS -l select=8:ncpus=8:mpiprocs=8:nmics=0:mem=60Gb
#PBS -N test-job
# limit job to 10 cpu hours (approx 98% of walltime)
#PBS -l cput=10:0:0
#PBS -l walltime=10:0:0
cd $PBS_O_WORKDIR
module add intel intelmpi
# change allocated nodes over to use the infiniband interconnect
NODEFILE="$PBS_JOBNAME-$PBS_JOBID.nodes"
cp $PBS_NODEFILE $NODEFILE
sed -i 's/\.local/.ibnet/g' $NODEFILE
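# (for example, a $PBS_NODEFILE entry such as node001.local would
#  become node001.ibnet; the exact hostname suffixes are site-specific)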
echo
echo -n "==== job started: "
date
echo
echo "=== limits ==="
echo
ulimit -s unlimited
ulimit -a
echo
echo "=== environment ==="
echo
printenv
echo
echo "=== nodes ==="
echo
cat $NODEFILE
echo
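# -np 64 matches the request above: select=8 nodes x mpiprocs=8 = 64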
/usr/bin/time mpirun -np 64 -hostfile $NODEFILE ./testc
rm $NODEFILE
echo
echo -n "=== job finished: "
date
echo
exit 0
The lines starting with #PBS tell the job scheduler how to allocate
resources and set other options for your job. The options of particular
interest are:
select=#
    allocate # separate nodes
ncpus=#
    on each node, allocate # cpus (cores)
mpiprocs=#
    on each node, allocate # cpus (of the ncpus allocated) to MPI.
    Normally, this is the same as ncpus.
nmics=#
    on each node, allocate # Intel Xeon Phi coprocessor cards
    (normally, 1 Phi is enough -- file a help request if you think
    you need more)
mem=#
    on each node, allocate # (kb, mb, gb) of memory
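With testc.pbs saved, submit the job and watch its progress (the job id
shown here is illustrative):

qsub testc.pbs    # prints a job id such as 12345.cherry-creek
qstat -u $USER    # S column shows the state: Q = queued, R = running

When the job finishes, PBS writes the script's stdout and stderr into the
submission directory as test-job.o<jobid> and test-job.e<jobid>, using the
name set by #PBS -N.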