Occasionally you may need to run an interactive job on a GPU node to test your code on GPU-capable hardware. You can query which GPU resources are available with sinfo -O gres -p <name of queue>:
$ sinfo -O gres
GRES
gpu:k80:4(S:1)
gpu:v100:4(S:0-1)
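If you only care about a single partition, pass its name with -p. For example, assuming the GPU nodes live in the gpuq partition used in the srun examples below, the query can be restricted to that partition:
$ sinfo -O gres -p gpuq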
From this output we can see that K80 and V100 GPUs are available. Now we can submit an interactive job requesting a specific GPU resource. For example, to request a single K80 GPU:
$ srun -p gpuq --gres=gpu:k80:1 --pty /bin/bash
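The same form works for the other GPU type reported by sinfo. For example, to request a single V100 instead (assuming the V100 nodes are also served by the gpuq partition):
$ srun -p gpuq --gres=gpu:v100:1 --pty /bin/bash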
If we instead request two K80 GPUs, the output of nvidia-smi on the assigned node confirms that both have been allocated:
$ srun -p gpuq --gres=gpu:k80:2 --pty /bin/bash
[g03]$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:8A:00.0 Off |                    0 |
| N/A   34C    P8    26W / 149W |      0MiB / 11441MiB |      0%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:8B:00.0 Off |                    0 |
| N/A   30C    P8    31W / 149W |      0MiB / 11441MiB |      0%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
Slurm also sets the CUDA_VISIBLE_DEVICES environment variable to the indices of the GPUs allocated to the job:
[@g03 ~]$ echo $CUDA_VISIBLE_DEVICES
0,1
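As an additional sanity check from inside the session, nvidia-smi -L prints one line per GPU visible to the job; the device UUIDs below are placeholders:
[@g03 ~]$ nvidia-smi -L
GPU 0: Tesla K80 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
GPU 1: Tesla K80 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)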