How to submit an MPI parallel job to SLURM
This article explains how to submit an MPI parallel job to SLURM on LOTUS. It covers:
- What is an MPI parallel job?
- MPI implementation and SLURM
- Parallel MPI job submission
What is an MPI parallel job?
An MPI parallel job runs on more than one core, and typically more than one host, using the Message Passing Interface (MPI) library for communication between all cores. A simple job script, such as "my_script_name.sbatch" given below, is all that is needed:
```shell
#!/bin/bash
#SBATCH -p par-multi
#SBATCH -n 36
#SBATCH -t 30
#SBATCH -o %j.log
#SBATCH -e %j.err

# Load a module for the gcc OpenMPI library (needed for mpi_myname.exe)
module load eb/OpenMPI/gcc/4.0.0

# Start the job running using OpenMPI's "mpirun" job launcher
mpirun ./mpi_myname.exe
```
-n refers to the number of processors or cores you wish to run on. The rest of the #SBATCH options select the partition (-p par-multi), set a 30-minute wall-clock limit (-t 30), and name the standard output and error files (-o, -e), where %j is replaced by the job ID. Submit the script to SLURM with:
$ sbatch --exclusive my_script_name.sbatch
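Note that mpi_myname.exe in the script above is only a placeholder for your own MPI executable. For illustration, a minimal MPI program it could stand for might look like the sketch below (the source filename mpi_myname.c is an assumption); each rank reports its rank, the total number of ranks, and the host it is running on:

```c
/* Hypothetical stand-in for mpi_myname.exe.
   Compile with the matching OpenMPI module loaded, e.g.:
     mpicc -o mpi_myname.exe mpi_myname.c */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks */
    MPI_Get_processor_name(name, &name_len);

    printf("Hello from rank %d of %d on host %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}
```

With -n 36 as in the script above, mpirun would launch 36 copies of this program, each printing one line.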
MPI implementation and SLURM
The OpenMPI library is the only supported MPI library on the cluster. OpenMPI v3.1.1 and v4.0.0 are provided; both are fully MPI-3 compliant. MPI I/O features are fully supported *only* on the LOTUS /work/scratch-pw directory, as this uses a fully parallel Panasas file system. The MPI implementations on CentOS7 LOTUS/SLURM are available via the module environment for each compiler, as listed below:
- eb/OpenMPI/gcc/3.1.1
- eb/OpenMPI/gcc/4.0.0
- eb/OpenMPI/intel/3.1.1
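After loading one of these modules, a quick sanity check confirms which OpenMPI installation is active (the module chosen below is one of those listed above):

```shell
# Load one of the OpenMPI modules listed above
module load eb/OpenMPI/gcc/4.0.0

# mpirun reports the Open MPI version it was built from
mpirun --version

# The compiler wrapper should resolve to the same OpenMPI installation
which mpicc
```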
Compiling parallel MPI code with OpenMPI
```shell
module load intel/20.0.0
module load eb/OpenMPI/intel/3.1.1
mpif90
```
will use the Intel Fortran compiler ifort with OpenMPI 3.1.1.
```shell
module load eb/OpenMPI/gcc/3.1.1
mpicc
```
will call the GNU C compiler gcc with OpenMPI 3.1.1.
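As a worked example, building an executable for the job script shown earlier might look like this (the source filenames are assumptions for illustration):

```shell
# GNU toolchain: load the matching OpenMPI module first
module load eb/OpenMPI/gcc/3.1.1

# C source (hypothetical filename)
mpicc -o mpi_myname.exe mpi_myname.c

# or Fortran source (hypothetical filename)
mpif90 -o mpi_myname.exe mpi_myname.f90
```

The compiler wrappers (mpicc, mpif90) add the MPI include paths and link flags automatically, so no extra -I or -l options are normally needed.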
The OpenMPI user guides can be found at https://www.open-mpi.org/doc/.