Occasionally you may need to run an interactive job on a GPU node to test your code on GPU-capable hardware. You can query which GPU resources are available with sinfo -O gres -p <name of queue>:
$ sinfo -O gres
GRES
gpu:k80:4(S:1)
gpu:v100:4(S:0-1)
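If you only care about a single partition, pass its name with -p. For example, assuming the GPU nodes live in the gpuq partition used in the srun examples below, the query can be restricted to that partition:
$ sinfo -O gres -p gpuq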
From this output we can see that K80 and V100 GPUs are available. Now we can submit an interactive job requesting a specific GPU resource. For example, to request a single K80 GPU:
$ srun -p gpuq --gres=gpu:k80:1 --pty /bin/bash
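The same form works for the other GPU type reported by sinfo. For example, to request a single V100 instead (assuming the V100 nodes are also served by the gpuq partition):
$ srun -p gpuq --gres=gpu:v100:1 --pty /bin/bash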
If we instead request two K80 GPUs, the output of nvidia-smi on the assigned node confirms that both have been allocated:
$ srun -p gpuq --gres=gpu:k80:2 --pty /bin/bash
[g03]$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:8A:00.0 Off |                    0 |
| N/A   34C    P8    26W / 149W |      0MiB / 11441MiB |      0%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:8B:00.0 Off |                    0 |
| N/A   30C    P8    31W / 149W |      0MiB / 11441MiB |      0%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
Slurm also sets the CUDA_VISIBLE_DEVICES environment variable to the indices of the GPUs allocated to the job:
[@g03 ~]$ echo $CUDA_VISIBLE_DEVICES
0,1
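As an additional sanity check from inside the session, nvidia-smi -L prints one line per GPU visible to the job; the device UUIDs below are placeholders:
[@g03 ~]$ nvidia-smi -L
GPU 0: Tesla K80 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
GPU 1: Tesla K80 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)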