How to run jobs with PBS Pro on Cherry-creek

For specific information on how to run more complex (for example, multi-node MPI) jobs, see the following: run_mpi

In general, apart from the additional steps needed to request resources for your job, the steps inside a batch job are the same ones you would use from an interactive command line.
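
For example, commands you would normally type at the prompt can be placed in a shell script, with #PBS lines added at the top to request resources. A minimal sketch (the resource values here are placeholders, not recommendations):

#!/bin/bash
#PBS -l ncpus=1,mem=1gb,walltime=5:0

# the same commands you would have typed interactively
cd $PBS_O_WORKDIR
ls -l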

The following is a quick step-by-step guide to getting started running jobs on cherry-creek.

A tar archive containing the files used in this example is available from: example-pbs.tar.

To run the example job, download and untar example-pbs.tar inside your cherry-creek account using the following commands:

wget http://www.nscee.edu/documentation/cherrycreek/example-pbs.tar
tar xf example-pbs.tar
and then enter the following commands:

cd example-pbs
qsub test-r.pbs

The system should respond with something like:

[youruname@cherry-creek example-pbs]$ qsub test-r.pbs
79560.cherry
[youruname@cherry-creek example-pbs]$

The "79560.cherry" is the job id for your batch job; you can use it with the qstat command to see what is happening:

[youruname@cherry-creek example-pbs]$ qstat -a 79560.cherry
qstat: 79560.cherry Job has finished, use -x or -H to obtain historical job information

This means that your job has finished. If you just run "qstat -a", it will list all active jobs.
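
As the qstat message suggests, information about a finished job can still be displayed by adding the -x flag:

qstat -x 79560.cherry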

Contents of the example-pbs.tar archive

example-1.txt   a copy of the interactive commands used to test things by hand.
test.r          the R script file.
test-r.pbs      the PBS script used to run R.

This example is based on the following web page: http://www.instantr.com/2013/02/12/performing-a-cluster-analysis-in-r

The test.r file looks like:

#
# this is where your R script code would go; the following is a
# copy of the commands tested in the previous example (just cut
# and pasted)
#

europe = read.csv("europe.csv", header=TRUE)   # read the data set
europe                                         # print it (this goes to the job's stdout)
euroclust<-hclust(dist(europe[-1]))            # cluster on every column except the first (Country)
pdf(file='outplot-pbs.pdf')                    # send plots to a PDF file
plot(euroclust, labels=europe$Country)         # draw the dendrogram
rect.hclust(euroclust, 5)                      # outline the 5 main clusters
q()                                            # quit R
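
If you want to test the script by hand before submitting it (this is how the commands recorded in example-1.txt were produced), you can run the same steps interactively:

module load R
R --save < test.r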

The PBS script file (test-r.pbs) looks like:

#!/bin/bash
#
#PBS -m be
#PBS -N NSIexample
#PBS -l ncpus=1,mem=30gb,cput=15:0 -l walltime=15:0

# show that the current directory is your home directory
pwd

# copy input files to PBS created scratchdir
cp example-pbs/test.r $TMPDIR
cp example-pbs/europe.csv $TMPDIR

# change to  PBS created scratch directory
cd $TMPDIR

# list current files in scratchdir
ls -l 

# setup environment for R
module load R intel intelmpi

# run our R script (in the file test.r)
R --save < test.r 

# list all of the files in the current directory
ls -l

# save output file outplot-pbs.pdf back to example directory
cp outplot-pbs.pdf ~/example-pbs

exit 0

The lines beginning with #PBS set PBS options for the job. The first one

#PBS -m be

tells PBS to send email to the user when the job begins execution (b) and when it terminates (e). The next line

#PBS -N NSIexample

gives the job a name. The line

#PBS -l ncpus=1,mem=30gb,cput=15:0 -l walltime=15:0

sets the number of CPUs to 1, the memory limit to 30gb, the maximum CPU time to 15 minutes, and the walltime limit to 15 minutes.
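
As a sketch of how the mail options can be extended (the address below is a placeholder for your own), adding a to -m also sends mail if the job is aborted, and -M sets the address the mail is sent to:

#PBS -m abe
#PBS -M youruname@example.edu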

The remaining lines are explained by the comments. After the job runs, the files NSIexample.e{jobnum}, NSIexample.o{jobnum}, and outplot-pbs.pdf should appear. The file NSIexample.e{jobnum}, if created, will contain any stderr output produced; the file NSIexample.o{jobnum} will contain any stdout output produced. The file outplot-pbs.pdf is the output R was told to create.

You will want to look at the stderr and stdout files for your job to see how the job ran (or if there were problems).
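
For example, if qsub reported the job id as 79560.cherry, the job number is 79560 and the files can be inspected with:

cat NSIexample.o79560     # stdout produced by the job
cat NSIexample.e79560     # stderr produced by the job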

Compiling an MPI C program

We want to run an MPI job that uses a total of 64 processes (cores). We also want to limit the number of processes running on each node to 8 (this gives us the flexibility to control how the system allocates the compute cores, so that OpenMP threads or other special needs can be taken into account).
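
In the PBS Pro select syntax this layout is written select=8:ncpus=8:mpiprocs=8, that is, 8 nodes with 8 MPI processes each, for 8 x 8 = 64 processes in total. As a hedged sketch of the flexibility mentioned above, a hybrid MPI/OpenMP job could keep the same 8 cores per node but start only one MPI rank per node:

  # 8 nodes x 8 cores, 1 MPI rank and 8 OpenMP threads per node
  #PBS -l select=8:ncpus=8:mpiprocs=1:ompthreads=8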

To compile a simple "hello world" MPI program (after logging into cherry-creek):


  module add intel                # activate the Intel compiler suite
  module add intelmpi             # activate the Intel MPI runtime

  # make a copy of the sample hello world program
  cp /share/apps/intel/impi_latest/test/test.c test.c
  mpicc test.c -o testc           # compile the sample program
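
Before submitting, you can confirm that the MPI tools are actually on your path, and optionally do a small local sanity run (if your site allows short test runs on the login node):

  which mpicc mpirun              # confirm the Intel MPI wrappers are loaded
  mpirun -np 2 ./testc            # quick 2-process test run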

Create a file called testc.pbs with the following (starting in column 1):


  #!/bin/bash
  #PBS -l select=8:ncpus=8:mpiprocs=8:mem=60Gb
  #PBS -N test-job
  # limit the job to 640 cpu minutes (the cput estimate is np*walltime minutes)
  #PBS -l cput=640:0
  #PBS -l walltime=10:0
  
  cd $PBS_O_WORKDIR
  module add intel intelmpi
  
  # change allocated nodes over to use the infiniband interconnect
  
  NODEFILE="$PBS_JOBNAME-$PBS_JOBID.nodes"
  cp $PBS_NODEFILE $NODEFILE
  sed -i 's/\.local/.ibnet/g' $NODEFILE
  
  echo
  echo -n "==== job started: "
  date
  echo
  
  echo "=== limits ==="
  echo
  ulimit -s unlimited
  ulimit -a
  echo
  echo "=== environment ==="
  echo
  printenv
  echo
  echo "=== nodes ==="
  echo
  cat $NODEFILE
  echo
  
  /usr/bin/time mpirun -np 64 -hostfile $NODEFILE ./testc
  rm $NODEFILE
  echo
  echo -n "=== job finished: "
  date
  echo
  exit 0
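
The nodefile handling near the top of the script is worth a note: PBS writes the hostnames of the allocated nodes to $PBS_NODEFILE, and the sed command rewrites each name so that MPI traffic uses the InfiniBand interconnect instead of the management network. As an illustration (the hostname shown is hypothetical; actual cherry-creek node names may differ):

  echo node017.local | sed 's/\.local/.ibnet/g'    # prints: node017.ibnet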

The lines starting with #PBS tell the job scheduler how to allocate resources and set other options for your job. The options of particular interest are listed below (a worked check of the numbers follows the list):

select=#     allocate # separate nodes
ncpus=#      on each node, allocate # cpus (cores)
mpiprocs=#   on each node, allocate # cpus (of the ncpus allocated) to MPI
mem=#        on each node, allocate # (kb, mb, gb) of memory
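
Putting these together for the script above, a quick check of the numbers:

select=8:ncpus=8:mpiprocs=8:mem=60Gb
#   8 nodes x 8 MPI ranks/node = 64 MPI processes (matching mpirun -np 64)
#   8 nodes x 60gb/node        = 480gb of memory in total
# cput:  np * walltime = 64 * 10 minutes = 640 cpu minutes (cput=640:0)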

Cherry-creek has three types of compute nodes:

To submit the test job:

qsub testc.pbs
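
As with the earlier example, qsub prints the job id, and qstat -a shows the job while it is queued or running. Once it finishes, the script's stdout (the limits, environment, and node list sections plus the program output) will be in test-job.o{jobnum}, and any errors in test-job.e{jobnum}:

qstat -a
cat test-job.o{jobnum}     # substitute the job number qsub printed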