How to run jobs with PBS Pro on Cherry-creek

For specific information on how to run more complex (for example, multi-node MPI) jobs, see the following: run_mpi

In general, apart from the additional steps needed to request resources for your job, the steps inside a batch job are the same ones you would use from an interactive command line.
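
For example, commands you would normally type at the prompt can be placed in a shell script, with #PBS lines added at the top to request resources. A minimal sketch (the resource values here are placeholders, not recommendations):

#!/bin/bash
#PBS -l ncpus=1,mem=1gb,walltime=5:0

# the same commands you would have typed interactively
cd $PBS_O_WORKDIR
ls -l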

The following is a quick step-by-step guide to getting started running jobs on cherry-creek.

A tar archive containing the files used in this example is available from: example-pbs.tar.

To run the example job, download and untar example-pbs.tar inside your cherry-creek account using the following commands:

wget http://www.nscee.edu/documentation/cherrycreek/example-pbs.tar
tar xf example-pbs.tar
and then enter the following commands:

cd example-pbs
qsub test-r.pbs

The system should respond with something like:

[youruname@cherry-creek example-pbs]$ qsub test-r.pbs
79560.cherry
[youruname@cherry-creek example-pbs]$

The "79560.cherry" is the job id for your batch job; you can use it with the qstat command to see what is happening:

[youruname@cherry-creek example-pbs]$ qstat -a 79560.cherry
qstat: 79560.cherry Job has finished, use -x or -H to obtain historical job information

This means that your job has finished. If you just run "qstat -a", it will list all active jobs.
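
As the qstat message suggests, information about a finished job can still be displayed by adding the -x flag:

qstat -x 79560.cherry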

Contents of the example-pbs.tar archive

example-1.txt   a copy of the interactive commands used to test things by hand.
test.r          the R script file.
test-r.pbs      the PBS script used to run R.

This example is based on the following web page: http://www.instantr.com/2013/02/12/performing-a-cluster-analysis-in-r

The test.r file looks like:

#
# this is where your R script code would go; the following is a
# copy of the commands tested in the previous example (just cut
# and pasted)
#

europe = read.csv("europe.csv", header=TRUE)   # read the data set
europe                                         # print it (this goes to the job's stdout)
euroclust<-hclust(dist(europe[-1]))            # cluster on every column except the first (Country)
pdf(file='outplot-pbs.pdf')                    # send plots to a PDF file
plot(euroclust, labels=europe$Country)         # draw the dendrogram
rect.hclust(euroclust, 5)                      # outline the 5 main clusters
q()                                            # quit R
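
If you want to test the script by hand before submitting it (this is how the commands recorded in example-1.txt were produced), you can run the same steps interactively:

module load R
R --save < test.r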

The PBS script file (test-r.pbs) looks like:

#!/bin/bash
#
#PBS -m be
#PBS -N NSIexample
#PBS -l ncpus=1,mem=30gb,cput=15:0 -l walltime=15:0

# show that the current directory is your home directory
pwd

# copy input files to PBS created scratchdir
cp example-pbs/test.r $TMPDIR
cp example-pbs/europe.csv $TMPDIR

# change to  PBS created scratch directory
cd $TMPDIR

# list current files in scratchdir
ls -l 

# setup environment for R
module load R intel intelmpi

# run our R script (in the file test.r)
R --save < test.r 

# list all of the files in the current directory
ls -l

# save output file outplot-pbs.pdf back to example directory
cp outplot-pbs.pdf ~/example-pbs

exit 0

The lines beginning with #PBS set PBS options for the job. The first one

#PBS -m be

tells PBS to send email to the user when the job begins execution (b) and when it terminates (e). The next line

#PBS -N NSIexample

gives the job a name. The line

#PBS -l ncpus=1,mem=30gb,cput=15:0 -l walltime=15:0

sets the number of CPUs to 1, the memory limit to 30gb, the maximum CPU time to 15 minutes, and the walltime limit to 15 minutes.
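
As a sketch of how the mail options can be extended (the address below is a placeholder for your own), adding a to -m also sends mail if the job is aborted, and -M sets the address the mail is sent to:

#PBS -m abe
#PBS -M youruname@example.edu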

The remaining lines are explained by the comments. After the job runs, the files NSIexample.e{jobnum}, NSIexample.o{jobnum}, and outplot-pbs.pdf should appear. The file NSIexample.e{jobnum}, if created, will contain any stderr output produced; the file NSIexample.o{jobnum} will contain any stdout output produced. The file outplot-pbs.pdf is the output R was told to create.

You will want to look at the stderr and stdout files for your job to see how the job ran (or if there were problems).
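
For example, if qsub reported the job id as 79560.cherry, the job number is 79560 and the files can be inspected with:

cat NSIexample.o79560     # stdout produced by the job
cat NSIexample.e79560     # stderr produced by the job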

Compiling an MPI C program

We want to run an MPI job that uses a total of 64 processes (cores). We also want to limit the number of processes running on each node to 8 (this gives us the flexibility to control how the system allocates the compute cores, so that OpenMP threads or other special needs can be taken into account).
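
In the PBS Pro select syntax this layout is written select=8:ncpus=8:mpiprocs=8, that is, 8 nodes with 8 MPI processes each, for 8 x 8 = 64 processes in total. As a hedged sketch of the flexibility mentioned above, a hybrid MPI/OpenMP job could keep the same 8 cores per node but start only one MPI rank per node:

  # 8 nodes x 8 cores, 1 MPI rank and 8 OpenMP threads per node
  #PBS -l select=8:ncpus=8:mpiprocs=1:ompthreads=8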

To compile a simple "hello world" MPI program (after logging into cherry-creek):


  module add intel                # activate the Intel compiler suite
  module add intelmpi             # activate the Intel MPI runtime

  # make a copy of the sample hello world program
  cp /share/apps/intel/impi_latest/test/test.c test.c
  mpicc test.c -o testc           # compile the sample program
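
Before submitting, you can confirm that the MPI tools are actually on your path, and optionally do a small local sanity run (if your site allows short test runs on the login node):

  which mpicc mpirun              # confirm the Intel MPI wrappers are loaded
  mpirun -np 2 ./testc            # quick 2-process test run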

Create a file called testc.pbs with the following (starting in column 1):


  #!/bin/bash
  #PBS -l select=8:ncpus=8:mpiprocs=8:mem=60Gb
  #PBS -N test-job
  # limit the job to 640 cpu minutes (the cput estimate is np*walltime minutes)
  #PBS -l cput=640:0
  #PBS -l walltime=10:0
  
  cd $PBS_O_WORKDIR
  module add intel intelmpi
  
  # change allocated nodes over to use the infiniband interconnect
  
  NODEFILE="$PBS_JOBNAME-$PBS_JOBID.nodes"
  cp $PBS_NODEFILE $NODEFILE
  sed -i 's/\.local/.ibnet/g' $NODEFILE
  
  echo
  echo -n "==== job started: "
  date
  echo
  
  echo "=== limits ==="
  echo
  ulimit -s unlimited
  ulimit -a
  echo
  echo "=== environment ==="
  echo
  printenv
  echo
  echo "=== nodes ==="
  echo
  cat $NODEFILE
  echo
  
  /usr/bin/time mpirun -np 64 -hostfile $NODEFILE ./testc
  rm $NODEFILE
  echo
  echo -n "=== job finished: "
  date
  echo
  exit 0
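
The nodefile handling near the top of the script is worth a note: PBS writes the hostnames of the allocated nodes to $PBS_NODEFILE, and the sed command rewrites each name so that MPI traffic uses the InfiniBand interconnect instead of the management network. As an illustration (the hostname shown is hypothetical; actual cherry-creek node names may differ):

  echo node017.local | sed 's/\.local/.ibnet/g'    # prints: node017.ibnet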

The lines starting with #PBS tell the job scheduler how to allocate resources and set other options for your job. The options of particular interest are listed below (a worked check of the numbers follows the list):

select=#     allocate # separate nodes
ncpus=#      on each node, allocate # cpus (cores)
mpiprocs=#   on each node, allocate # cpus (of the ncpus allocated) to MPI
mem=#        on each node, allocate # (kb, mb, gb) of memory
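
Putting these together for the script above, a quick check of the numbers:

select=8:ncpus=8:mpiprocs=8:mem=60Gb
#   8 nodes x 8 MPI ranks/node = 64 MPI processes (matching mpirun -np 64)
#   8 nodes x 60gb/node        = 480gb of memory in total
# cput:  np * walltime = 64 * 10 minutes = 640 cpu minutes (cput=640:0)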

Cherry-creek has three types of compute nodes:

To submit the test job:

qsub testc.pbs
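
As with the earlier example, qsub prints the job id, and qstat -a shows the job while it is queued or running. Once it finishes, the script's stdout (the limits, environment, and node list sections plus the program output) will be in test-job.o{jobnum}, and any errors in test-job.e{jobnum}:

qstat -a
cat test-job.o{jobnum}     # substitute the job number qsub printed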