How to run jobs with PBS Pro on Cherry-creek
For information on running more complex jobs (for example, multi-node MPI), see: run_mpi
In general, apart from the additional steps needed to request resources for your job, a batch job runs the same commands you would type at an interactive command line.
The following is a quick step-by-step guide to getting started running jobs on cherry-creek.
A zip archive containing the files used in this example is available from: example-pbs.zip.
To run the example job, download and unzip example-pbs.zip in your cherry-creek account, then enter the following commands:
cd example-pbs
qsub -q small test-r.pbs
The system should respond with something like:
[ron@cherry-creek example-pbs]$ qsub -q small test-r.pbs
79560.cherry
[ron@cherry-creek example-pbs]$
The "79560.cherry" is the job id for your batch job; you can use it with the qstat command to see what is happening:
[ron@cherry-creek example-pbs]$ qstat -a 79560.cherry
qstat: 79560.cherry Job has finished, use -x or -H to obtain historical job information
This means that your job has already finished. If you just run "qstat -a" with no job id, it will list all active jobs.
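As the message suggests, adding -x shows information about jobs that have already finished, for example:
qstat -x 79560.cherry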
Contents of the example-pbs.zip archive
example-1.txt | a copy of the interactive commands used to test things by hand
test.r | the R script file
test-r.pbs | the PBS script that runs R
This example is based on the following web page: http://www.instantr.com/2013/02/12/performing-a-cluster-analysis-in-r
The test.r file looks like:
#
# this is where your R script code would go, the following is a
# copy of the commands tested in the previous example (just cut
# and pasted)
#
europe = read.csv("europe.csv",header=TRUE)
europe
euroclust<-hclust(dist(europe[-1]))
pdf(file='outplot-pbs.pdf')
plot(euroclust, labels=europe$Country)
rect.hclust(euroclust, 5)
q()
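If you want to check the script by hand before submitting it, the same run can be done interactively on a login node (a quick sketch; it assumes europe.csv is already in the current directory, since the download step lives in the batch script below):
module load R
R --save < test.r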
The PBS script file (test-r.pbs) is here:
#!/bin/bash
#
#PBS -l ncpus=1,mem=30gb,nmics=1,cput=1:0:0 -l walltime=1:0:0
cd $PBS_O_WORKDIR # change to our (current) directory
module load R intel intelmpi
export MKL_MIC_ENABLE=1
export MIC_OMP_NUM_THREADS=240
export OFFLOAD_REPORT=2
if [ ! -f europe.csv ]; then
    #
    # if our europe data set does not exist, download it...
    #
    wget --quiet http://www.instantr.com/wp-content/uploads/2013/01/europe.csv
fi
#
# run our R script (in the file test.r)
#
R --save < test.r
#
# list all of the files in the current directory
#
ls -l
exit 0
The first three lines tell PBS to use the standard Linux shell and set the job limits (1 cpu, 30gb of memory, 1 Phi coprocessor, and 1 hour each of cpu time and walltime).
The next line changes to the directory the job was submitted from; the directory name is stored in $PBS_O_WORKDIR (which is set by the qsub command).
The next four lines (module ... export OFFLOAD_REPORT=2) configure R to use the cherry-creek Phi coprocessors.
The if ... fi statement downloads the R dataset if it is not already present.
Then we run the test.r script file with R.
The last two lines list all of the files in the work directory and exit.
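If you need to stop a queued or running job before it finishes, pass its job id to qdel:
qdel 79560.cherry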
After the job runs, the contents of the example-pbs directory should look something like:
[ron@cherry-creek example-pbs]$ ls -l
total 44
-rw-rw-r-- 1 ron ron 1410 Jan 8 2013 europe.csv
-rw-rw-r-- 1 ron ron 4731 May 1 11:30 example-1.txt
-rw-rw-r-- 1 ron ron 8211 May 1 11:47 outplot-pbs.pdf
-rw-rw-r-- 1 ron ron 895 May 1 11:23 test.r
-rw-rw-r-- 1 ron ron 562 May 1 11:27 test-r.pbs
-rw------- 1 ron ron 39 May 1 11:47 test-r.pbs.e79560
-rw------- 1 ron ron 4447 May 1 11:47 test-r.pbs.o79560
europe.csv | the downloaded dataset
outplot-pbs.pdf | the output plot in pdf format
test-r.pbs.e79560 | the stderr (error log) for the job
test-r.pbs.o79560 | the stdout (output log) for the job
You will want to look at the stderr and stdout files for your job to see how it ran and whether there were any problems.
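For example, to print both logs for the job above:
cat test-r.pbs.e79560
cat test-r.pbs.o79560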
Compiling an MPI C program
We want to run an MPI job that uses a total of 64 processes (cores), limited to 8 processes per node. Limiting the per-node process count gives us flexibility in how the system allocates the compute cores, so OpenMP threads or other special needs can be taken into account.
To compile a simple "hello world" MPI program (after logging into cherry-creek):
module add intel # activate the Intel compiler suite
module add intelmpi # activate the Intel MPI runtime
# make a copy of the sample hello world program
cp /share/apps/intel/impi_latest/test/test.c test.c
mpicc test.c -o testc # compile the sample program
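If that sample file is not present on your system, a minimal stand-in that uses only standard MPI calls looks like the following (save it as test.c and compile it the same way):
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                 /* start the MPI runtime  */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks  */
    MPI_Get_processor_name(name, &len);     /* host this rank runs on */

    printf("Hello world: rank %d of %d on %s\n", rank, size, name);

    MPI_Finalize();                         /* shut down MPI cleanly  */
    return 0;
}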
Create a file called testc.pbs with the following (starting in column 1):
#!/bin/bash
#PBS -l select=8:ncpus=8:mpiprocs=8:nmics=0:mem=60Gb
#PBS -N test-job
#PBS -l cput=640:0
#PBS -l walltime=10:0
# limit the job to 640 cpu minutes; estimate cput as np*walltime minutes
cd $PBS_O_WORKDIR
module add intel intelmpi
# change allocated nodes over to use the infiniband interconnect
NODEFILE="$PBS_JOBNAME-$PBS_JOBID.nodes"
cp $PBS_NODEFILE $NODEFILE
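# e.g. a hypothetical entry "node001.local" becomes "node001.ibnet"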
sed -i 's/\.local/.ibnet/g' $NODEFILE
echo
echo -n "==== job started: "
date
echo
echo "=== limits ==="
echo
ulimit -s unlimited
ulimit -a
echo
echo "=== environment ==="
echo
printenv
echo
echo "=== nodes ==="
echo
cat $NODEFILE
echo
/usr/bin/time mpirun -np 64 -hostfile $NODEFILE ./testc
rm $NODEFILE
echo
echo -n "=== job finished: "
date
echo
exit 0
The lines starting with #PBS tell the job scheduler how to allocate resources and set other options for your job. The options of particular interest are:
select=# | allocate # separate nodes
ncpus=# | on each node, allocate # cpus (cores)
mpiprocs=# | on each node, assign # of the allocated ncpus to MPI processes
nmics=# | on each node, allocate # Intel Xeon Phi coprocessor cards
mem=# | on each node, allocate # (kb, mb, gb) of memory
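The per-node values multiply across select: the request above allocates 8 nodes x 8 MPI processes = 64 ranks, which is why mpirun is given -np 64. As a purely hypothetical variant, the same 64 ranks could be packed onto fewer, larger nodes with:
#PBS -l select=4:ncpus=16:mpiprocs=16:nmics=0:mem=60Gb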
Cherry-creek has three types of compute nodes:
To submit the test job:
qsub testc.pbs
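Because the script sets #PBS -N test-job, the log files are named after the job: test-job.o<jobid> for stdout and test-job.e<jobid> for stderr, where <jobid> is the numeric part of the id that qsub prints. To watch the job and then read its output:
qstat -a
cat test-job.o<jobid>   # substitute your job's numeric id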