In this lab we will use Open MPI to compile a very basic "Hello World!" program, which we will then submit to run across multiple compute nodes.
Once you have logged into the Discovery cluster, the first thing you will want to do is load the MPI libraries into your environment with the Open MPI module:
$ module load openmpi
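You can verify the module loaded correctly by listing your loaded modules and checking that the MPI compiler wrapper is now on your path (the exact path printed will depend on where Open MPI is installed on the cluster):

$ module list
$ which mpicc

If `which mpicc` prints nothing, the module did not load; re-run the `module load` command and check for error messages.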
Next we need to create the MPI program. With your favorite editor, create a new file called "sample_mpi_hello_world.c" containing the following:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI execution environment
    MPI_Init(NULL, NULL);

    // Get the total number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank (ID) of this process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the node this process is running on
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    // Clean up the MPI environment
    MPI_Finalize();
    return 0;
}
Next we compile the program with the MPI compiler wrapper:
$ mpicc -o sample_mpi_hello_world sample_mpi_hello_world.c
Once that completes, you can test the compiled program by running it across 4 CPU cores like this:
$ mpirun -n 4 sample_mpi_hello_world
Hello world from processor discovery7.hpcc.dartmouth.edu, rank 0 out of 4 processors
Hello world from processor discovery7.hpcc.dartmouth.edu, rank 1 out of 4 processors
Hello world from processor discovery7.hpcc.dartmouth.edu, rank 2 out of 4 processors
Hello world from processor discovery7.hpcc.dartmouth.edu, rank 3 out of 4 processors
Now that we have a working MPI program, the next step is to launch it as a batch job. Here is an example batch script which requests multiple CPU cores across multiple nodes. We named it "sample_mpi.sh" for this example.
#!/bin/bash -l
# Name of the partition / queue
#SBATCH --partition=standard
# Name of the cluster account
# How long should the job run for
#SBATCH --time=01:00:00
# Number of CPU cores, in this case 4 cores
#SBATCH --ntasks=4
# Number of compute nodes to use, in this case 2
#SBATCH --nodes=2
# Names of the output files to be created (%x = job name, %j = job ID).
# If not specified, standard output and standard error are joined in one file.
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
# The code you want to run in your job
mpirun -n 4 ./sample_mpi_hello_world
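Note that the task count given to mpirun matches the #SBATCH --ntasks request above. On most Slurm clusters Open MPI is built with Slurm support, in which case mpirun can read the task count from the job allocation itself, so the last line could simply be (confirm this against your site's documentation before relying on it):

mpirun ./sample_mpi_hello_world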
Once we have our submit script created, we can submit it to the cluster via sbatch:
$ sbatch sample_mpi.sh
Submitted batch job 3977
To see the job running, use the squeue command and look for the job ID above.
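For example, you can filter squeue to only your own jobs with the -u flag (the output shown here is illustrative; your job ID, username, and node names will differ):

$ squeue -u $USER
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
   3977  standard sample_m    netid  R       0:05      2 q[06-07]

The ST column shows R once the job is running, or PD while it is still pending in the queue.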
When the job runs it will create two output files:
sample_mpi.sh.3977.out
sample_mpi.sh.3977.err
We can see from the output file, "sample_mpi.sh.3977.out", that the job ran on 4 cores across 2 compute nodes.
$ cat sample_mpi.sh.3977.out
Hello world from processor q06.hpcc.dartmouth.edu, rank 1 out of 4 processors
Hello world from processor q06.hpcc.dartmouth.edu, rank 0 out of 4 processors
Hello world from processor q07.hpcc.dartmouth.edu, rank 2 out of 4 processors
Hello world from processor q07.hpcc.dartmouth.edu, rank 3 out of 4 processors