Submitting CUDA or OpenACC Jobs

Available GPUs

Crane has four types of GPUs available in the gpu partition. The type of GPU is configured as a SLURM feature, so you can specify a type of GPU in your job resource requirements if necessary.

Description            | SLURM Feature | Available Hardware
Tesla K20, non-IB      | gpu_k20       | 3 nodes - 2 GPUs with 4 GB mem per node
Tesla K20, with IB     | gpu_k20       | 3 nodes - 3 GPUs with 4 GB mem per node
Tesla K40, with IB     | gpu_k40       | 5 nodes - 4 K40M GPUs with 11 GB mem per node
                       |               | 1 node - 2 K40C GPUs
Tesla P100, with OPA   | gpu_p100      | 2 nodes - 2 GPUs with 12 GB per node
Tesla V100, with 10GbE | gpu_v100      | 1 node - 4 GPUs with 16 GB per node
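
Hardware in the gpu partition may change over time, so it can help to ask SLURM directly which features and GPUs each node advertises. The following is a minimal sketch, assuming the standard sinfo command is on your PATH; list_gpu_nodes is a hypothetical helper name, not a SLURM command.

```shell
#!/bin/sh
# list_gpu_nodes is a hypothetical helper that wraps sinfo.
list_gpu_nodes() {
    # One line per node: %N = node name, %f = features, %G = generic resources (GRES).
    sinfo --partition=gpu --Node --format="%N %f %G"
}

# Only call sinfo where it exists (for example on a cluster login node).
if command -v sinfo >/dev/null 2>&1; then
    list_gpu_nodes
else
    echo "sinfo not found; run this on a cluster login node" >&2
fi
```

The features column should contain values such as gpu_k40 that can be passed to --constraint.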

To run your job on the next available GPU regardless of type, add the following options to your srun or sbatch command:

--partition=gpu --gres=gpu

To run on a specific type of GPU, constrain your job to require the corresponding feature. For example, to run on K40 GPUs:

--partition=gpu --gres=gpu --constraint=gpu_k40

You may request multiple GPUs by changing the --gres value, for example to --gres=gpu:2. Note that this value is per node. For example, --nodes=2 --gres=gpu:2 will request 2 nodes with 2 GPUs each, for a total of 4 GPUs.
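
The per-node arithmetic can be checked with a short sketch. The helper names gpu_request and total_gpus are hypothetical and not part of SLURM; they only assemble the options described above and multiply the node count by the GPUs per node.

```shell
#!/bin/sh
# gpu_request (hypothetical helper): print the GPU-related options for srun/sbatch.
gpu_request() {
    nodes=$1           # value for --nodes
    gpus_per_node=$2   # value after --gres=gpu:
    feature=${3:-}     # optional SLURM feature, e.g. gpu_k40
    opts="--partition=gpu --nodes=$nodes --gres=gpu:$gpus_per_node"
    if [ -n "$feature" ]; then
        opts="$opts --constraint=$feature"
    fi
    echo "$opts"
}

# total_gpus (hypothetical helper): --gres is per node, so multiply by the node count.
total_gpus() {
    echo $(( $1 * $2 ))
}

gpu_request 2 2 gpu_k40   # prints: --partition=gpu --nodes=2 --gres=gpu:2 --constraint=gpu_k40
total_gpus 2 2            # prints: 4
```

The printed option string can be pasted into an srun or sbatch command line.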

Compiling

Compilation of CUDA or OpenACC jobs must be performed on the GPU nodes. Therefore, you must run an interactive job to compile. An example command to compile in the gpu partition could be:

$ srun --partition=gpu --gres=gpu --mem-per-cpu=1024 --ntasks-per-node=6 --nodes=1 --pty $SHELL

This command starts a shell on a GPU node with 6 cores and 6 GB of RAM (6 tasks at 1024 MB each), where you can compile a GPU job. It is also useful if you want to run a test GPU job interactively.
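
Once the interactive shell is running, a compile step might look like the sketch below. It assumes the CUDA toolkit is provided by the same cuda/8.0 module that the submit script loads, and cuda-app.cu is a hypothetical source file name. The first lines simply reproduce the memory arithmetic for the srun request.

```shell
#!/bin/sh
# Run inside the interactive shell started by the srun command above.

# Memory of the allocation: --mem-per-cpu (in MB) times the number of tasks.
mem_per_cpu_mb=1024
ntasks=6
total_mb=$(( mem_per_cpu_mb * ntasks ))
echo "Allocation memory: ${total_mb} MB"   # prints: Allocation memory: 6144 MB

# Assumption: nvcc comes from the cuda/8.0 module used in the submit script.
if command -v module >/dev/null 2>&1; then
    module load cuda/8.0
fi

# cuda-app.cu is a hypothetical source file; adjust the names to your project.
if command -v nvcc >/dev/null 2>&1 && [ -f cuda-app.cu ]; then
    nvcc -O2 -o cuda-app.exe cuda-app.cu || echo "compilation failed" >&2
else
    echo "nvcc or cuda-app.cu not found; nothing compiled" >&2
fi
```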

Submitting Jobs

CUDA and OpenACC submissions require running on GPU nodes.

cuda.submit
#!/bin/sh
#SBATCH --time=03:15:00
#SBATCH --mem-per-cpu=1024
#SBATCH --job-name=cuda
#SBATCH --partition=gpu
#SBATCH --gres=gpu
#SBATCH --error=/work/[groupname]/[username]/job.%J.err
#SBATCH --output=/work/[groupname]/[username]/job.%J.out

module load cuda/8.0
./cuda-app.exe
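
A script like cuda.submit is handed to the scheduler with sbatch. The sketch below assumes the script is saved as cuda.submit in the current directory; submit_and_report is a hypothetical helper name, while --parsable is a standard sbatch option that prints only the job ID.

```shell
#!/bin/sh
# submit_and_report (hypothetical helper): submit a batch script and print its job ID.
submit_and_report() {
    script=$1
    # --parsable prints the job ID, optionally followed by ";cluster-name".
    jobid=$(sbatch --parsable "$script") || return 1
    jobid=${jobid%%;*}   # drop the optional ";cluster-name" suffix
    echo "Submitted $script as job $jobid"
}

# Only submit where sbatch exists (for example on a cluster login node).
if command -v sbatch >/dev/null 2>&1; then
    # Show the queue entry for the new job after a successful submission.
    submit_and_report cuda.submit && squeue -j "$jobid"
else
    echo "sbatch not found; run this on a cluster login node" >&2
fi
```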

OpenACC submissions require loading the PGI compiler (which is currently required to compile as well).

openacc.submit
#!/bin/sh
#SBATCH --time=03:15:00
#SBATCH --mem-per-cpu=1024
#SBATCH --job-name=cuda-acc
#SBATCH --partition=gpu
#SBATCH --gres=gpu
#SBATCH --error=/work/[groupname]/[username]/job.%J.err
#SBATCH --output=/work/[groupname]/[username]/job.%J.out


module load cuda/8.0 compiler/pgi/16
./acc-app.exe
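
Before openacc.submit can run, acc-app.exe has to be built with the PGI compiler in an interactive shell on a GPU node. The following is a sketch under stated assumptions: the modules from the script above are already loaded, acc-app.c is a hypothetical source file, and compile_openacc is a hypothetical helper name. The -acc and -Minfo=accel flags are standard PGI options.

```shell
#!/bin/sh
# Assumes "module load cuda/8.0 compiler/pgi/16" has already been run in this shell.

# compile_openacc (hypothetical helper): build an OpenACC program with pgcc.
compile_openacc() {
    src=$1
    out=$2
    # -acc enables OpenACC directives; -Minfo=accel reports what was offloaded to the GPU.
    pgcc -acc -Minfo=accel -o "$out" "$src"
}

# acc-app.c is a hypothetical source file; compile only if the compiler and file exist.
if command -v pgcc >/dev/null 2>&1 && [ -f acc-app.c ]; then
    compile_openacc acc-app.c acc-app.exe
fi
```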