The Batch System
- The batch system used on Discovery is Slurm.
- Users log in to one of the submit nodes via ssh and submit jobs to be run on the compute nodes by writing a script file that describes the job.
- They submit the job to the cluster using the sbatch command.
The primary partitions on the cluster are:
- standard – This is the main non-GPU queue for the cluster. It does not need to be specified as it is used by default.
- gpuq – This is a queue set up to run GPU-related jobs on the two production GPU nodes. It is open to all users.
- preemptable – A larger set of non-GPU nodes available to all users; jobs from free-account users have lower priority and may be requeued by a higher-priority account.
- Several additional preemptable partitions exist for the newer GPU nodes.
Users specify the amount of time and the number of processors required by their jobs, as sketched below.
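As a minimal sketch, a job script can select a non-default partition, a time limit, and a processor count with directives like the following (gpuq is taken from the list above; the values are illustrative only):

#SBATCH --partition=gpuq
#SBATCH --time=01:00:00
#SBATCH --ntasks=1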
Managing and Monitoring Your Jobs
Some useful commands:
Command | Example | Description
--- | --- | ---
sbatch | sbatch <job script> | submit a batch job to the queue
squeue | squeue | show the status of Slurm batch jobs
scancel | scancel JOBID | cancel a job
sinfo | sinfo | show information about partitions
scontrol | scontrol show job JOBID | check the status of a running or idle job
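For example, a typical submit-and-monitor sequence might look like the following (the script name test.sh and the job ID 12345 are placeholders):

sbatch test.sh            # submit the job script; Slurm prints the assigned job ID
squeue -u $USER           # list only your own jobs
scontrol show job 12345   # detailed information about one job
scancel 12345             # cancel the job if it is no longer needed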
The default time limit for any job submitted to the queue is currently one hour, and the default maximum number of processors per user depends on their user status.
Information on Submitting Jobs to the Queue
- Jobs that run longer than thirty days will be terminated by the scheduler.
- These parameters are subject to change as we become more familiar with users' needs.
- It is important for users to specify the resources required by their jobs. The default is 1 CPU and 8GB of memory on a single node.
- In the current configuration, the walltime and the number of nodes are the two parameters that matter.
- If you don’t specify the walltime, the system default of one hour will be assumed and your job may be terminated early.
- See the Single Processor Job Example for further details; a minimal sketch also appears below.
Scripts initiated by sbatch inherit the environment that exists when you run the sbatch command, unless the script begins with #!/bin/bash -l, in which case the job gets the environment of a fresh login to Discovery (recommended).
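A minimal single-processor job script, assuming a hypothetical program ./myprog and illustrative resource values, might look like this:

#!/bin/bash -l
#SBATCH --job-name=serial_test   # illustrative job name
#SBATCH --nodes=1                # run on a single node
#SBATCH --ntasks=1               # one task (CPU)
#SBATCH --mem=8G                 # memory for the job
#SBATCH --time=02:00:00          # walltime; if omitted, the one-hour default applies

# Runs on the compute node with a fresh login environment because of the -l above
./myprog

Submit it with sbatch (e.g. sbatch serial_test.sh) and monitor it with squeue as shown above.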
Information for Multiprocessor Jobs
- For multiprocessor jobs, it is important to specify the number of nodes and processors required and to select nodes that are of the same architecture.
- The nodes are divided into partitions. The nodes within each partition are homogeneous, with similar chip vendor and speed as well as disk and memory size.
- See the Sample parallel job scripts for examples of how to submit parallel jobs; illustrative sketches also appear at the end of this section. Parallel programs that need to communicate between processes will run more efficiently if all of the processes are in the same group. In general, only MPI jobs know how to utilize multiple nodes.
- For example:
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
- Before you submit your job, use the sinfo command to see which nodes are currently running jobs so you can select a partition that has free nodes.
- Programs using OpenMP or similar multithreading technology can use multiple CPUs for a single task. For these jobs, leave the nodes at the default (1), and use e.g.
#SBATCH --cpus-per-task=4
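For concreteness, here are illustrative sketches of both kinds of scripts; the program names mpi_prog and omp_prog, the time limits, and the resource values are assumptions, not site-specific recommendations. An MPI job spanning two nodes could look like:

#!/bin/bash -l
#SBATCH --nodes=2                # two nodes of the same architecture
#SBATCH --ntasks-per-node=4      # four MPI ranks per node (eight in total)
#SBATCH --time=04:00:00

# srun launches one copy of the program for each task allocated above
srun ./mpi_prog

A multithreaded (OpenMP) job keeps a single node and task but requests several CPUs for that task:

#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4        # four CPUs for the single task
#SBATCH --time=04:00:00

# Tell the program how many threads it may use
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./omp_prog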
Interactive Jobs
An interactive job is a job that returns a command line prompt (instead of running a script) when the job runs. Interactive jobs are useful when debugging or interacting with an application. The srun command is used to submit an interactive job to Slurm. When the job starts, a command line prompt will appear on one of the compute nodes assigned to the job. From here commands can be executed using the resources allocated on the local node.
[john@discovery ~]$ srun --account=rc --pty /bin/bash
[john@p04 ~]$ hostname
p04.hpcc.dartmouth.edu
[john@p04 ~]$
Jobs submitted with srun --pty /bin/bash will be assigned the cluster default values of 1 CPU and 1024MB of memory. The account must also be specified; the job will not run otherwise. If additional resources are required, they can be requested as options to the srun command. The following example job is assigned 2 nodes, each with 4 CPUs and 4GB of memory:
srun --nodes=2 --ntasks-per-node=4 --mem-per-cpu=1GB --cpus-per-task=1 --account=rc --pty /bin/bash
[john@q06 ~]$
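If an interactive session needs a GPU, the same srun options can be combined with the gpuq partition described above and a generic-resource request; the single-GPU request below uses Slurm's standard --gres syntax and is an illustrative assumption rather than a site-specific recipe:

srun --partition=gpuq --gres=gpu:1 --account=rc --pty /bin/bash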