What MPI Is (and What It Is Not)
MPI (Message Passing Interface) is a standardized API for writing distributed-memory parallel programs, typically in C, C++, and Fortran.
Key points:
- Standard, not a language: MPI is a specification. Implementations like MPICH, Open MPI, Intel MPI follow the same standard.
- Library-based: You write normal C/C++/Fortran programs and link against an MPI library.
- Message passing: Parallelism is achieved by multiple processes that explicitly send and receive messages across a network.
- Distributed memory focus: Each process has its own address space. There is no shared memory between MPI processes.
MPI is designed to be:
- Portable: Same code should run on laptops, clusters, and supercomputers (with a suitable MPI library installed).
- Scalable: It can support thousands to hundreds of thousands of processes (and more).
- Language-agnostic (within its scope): Official bindings exist for C and Fortran; the C++ bindings were deprecated in MPI-2.2 and removed in MPI-3.
MPI is not:
- A shared-memory threading model (that’s OpenMP, threads, etc.).
- An automatic parallelizer. You must decide what to send where and when.
- A specific implementation (Open MPI, MPICH, etc. are implementations of the MPI standard).
Basic MPI Programming Model
MPI follows a SPMD (Single Program, Multiple Data) style by default:
- You write one program.
- The program is started as many times as you request (e.g., 4 processes).
- Each instance is called an MPI process.
- The processes execute the same code but can behave differently depending on their rank.
Core ideas:
- Process: An independent OS process running your MPI program.
- Rank: An integer that identifies a process within a group (communicator). Typically ranks are numbered 0, 1, 2, ..., N-1.
- Communicator: A group of processes that can communicate with each other. The default is MPI_COMM_WORLD, which contains all processes in your job.
Because processes do not share memory, all inter-process cooperation happens through explicit message passing:
- Point-to-point: One process sends a message to exactly one other process.
- Collective: A group of processes participates in an operation (broadcast, reduction, etc.).
This chapter focuses on the basic MPI concepts you need to understand the rest of the MPI-related chapters.
Setting Up and Running MPI Programs
You normally compile and run MPI programs using wrappers and launchers provided by the MPI implementation.
Typical commands (names may vary slightly per MPI library):
- Compilation:
  - C: mpicc mycode.c -o myprog
  - C++: mpicxx mycode.cpp -o myprog
  - Fortran: mpif90 mycode.f90 -o myprog
These wrapper compilers:
- Add the right include paths (e.g., mpi.h for C).
- Link against the correct MPI libraries.
- Execution:
You do not normally run MPI programs with ./myprog directly. Instead, you launch multiple processes with an MPI launcher. On many systems:
- mpirun -np 4 ./myprog
- mpiexec -n 4 ./myprog
On HPC clusters, this is often integrated with the batch scheduler (e.g., SLURM). There you may use:
srun ./myprog (with job script options specifying the number of tasks/processes)
Details of interaction with job schedulers are handled in the job scheduling chapters; here the main concept is that the MPI runtime is responsible for starting the requested number of processes and connecting them.
Minimal Structure of an MPI Program
Almost every MPI program in C/C++/Fortran has the same basic skeleton:
- Initialize MPI
- Query the environment (number of processes, rank, etc.)
- Perform communication and computation
- Finalize MPI
Initialization and Finalization
In C-style pseudocode:
#include <mpi.h>
int main(int argc, char *argv[]) {
MPI_Init(&argc, &argv); // 1. Initialize
// ... your MPI code here ...
MPI_Finalize(); // 4. Finalize
return 0;
}
Key points:
- MPI_Init must be called before any other MPI function.
- MPI_Finalize must be called last, after all communication is done.
- Both must be called by all MPI processes in the communicator (commonly MPI_COMM_WORLD).
In Fortran, the pattern is similar (with call MPI_INIT(ierr) and call MPI_FINALIZE(ierr)).
Discovering the Parallel Environment
You typically begin by asking:
- How many processes are there?
- Which process am I?
In C:
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
- MPI_Comm_size(comm, &size) returns the number of processes in the communicator comm.
- MPI_Comm_rank(comm, &rank) returns the calling process’s rank in comm.
These two queries are enough to let you:
- Assign different tasks to different ranks.
- Partition data between processes.
- Choose communication patterns based on rank.
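For example, a common first use of the rank is to partition a range of indices across processes. The sketch below is only illustrative: the problem size N and the "work" (summing a slice of indices) are placeholders, and it assumes N is divisible by the number of processes.
#include <stdio.h>
#include <mpi.h>

#define N 1000  /* total problem size (placeholder) */

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int world_size, world_rank;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Each rank works on its own contiguous block of indices. */
    int chunk = N / world_size;   /* assumes N is divisible by world_size */
    int start = world_rank * chunk;
    int end   = start + chunk;

    long local_sum = 0;
    for (int i = start; i < end; i++) {
        local_sum += i;           /* placeholder computation */
    }

    printf("Rank %d handled indices [%d, %d) with local sum %ld\n",
           world_rank, start, end, local_sum);

    MPI_Finalize();
    return 0;
}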
A First MPI Program: "Hello, World"
A classic first example is a parallel "Hello, world" where each process identifies itself.
C example:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
MPI_Init(&argc, &argv);
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
printf("Hello from rank %d out of %d processes\n",
world_rank, world_size);
MPI_Finalize();
return 0;
}
Key observations:
- The program is identical for all processes.
- Output lines differ because world_rank is different on each process.
- When run with -np 4, you will see four lines, one from each rank.
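When launched with four processes (e.g., mpirun -np 4 ./myprog), the output will look roughly like the following. The exact ordering of the lines is not guaranteed, because the ranks print independently:
Hello from rank 2 out of 4 processes
Hello from rank 0 out of 4 processes
Hello from rank 3 out of 4 processes
Hello from rank 1 out of 4 processes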
Basic MPI Data Types and Messages (High-Level View)
MPI functions often need to know:
- Buffer address: Where the data is in memory.
- Count: How many items.
- Datatype: What kind of items.
- Destination/source: Which rank to send to or receive from.
- Tag: A small integer label to help match messages.
- Communicator: The group of processes involved in this communication.
For example, most point-to-point operations take arguments of the general form:
- buffer
- count
- datatype (e.g., MPI_INT, MPI_DOUBLE)
- destination_rank / source_rank
- tag
- communicator (e.g., MPI_COMM_WORLD)
MPI provides its own datatypes (e.g., MPI_INT, MPI_DOUBLE, MPI_CHAR) to stay portable across machines where the size and representation of C and Fortran types may differ.
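As a concrete illustration of this argument pattern, here is the C prototype of MPI_Send, annotated with the roles listed above (MPI_Recv is analogous, but takes a source rank and an MPI_Status argument):
int MPI_Send(const void *buf,       /* buffer address               */
             int count,             /* number of items              */
             MPI_Datatype datatype, /* e.g., MPI_INT, MPI_DOUBLE    */
             int dest,              /* destination rank             */
             int tag,               /* message tag                  */
             MPI_Comm comm);        /* communicator                 */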
Basic Point-to-Point Communication Concepts
Details of point-to-point functions are covered in the corresponding chapter; here the aim is to introduce the ideas in abstract terms.
Core concepts:
- Send: A process sends data from its memory to another process.
- Receive: A process receives incoming data into its memory.
- Matching: A receive operation specifies from whom to receive (or “any source”), and what tag to expect (or “any tag”).
Typical operations (names only, not full explanation here):
- MPI_Send
- MPI_Recv
- Nonblocking variants like MPI_Isend, MPI_Irecv (introduced later in more detail)
Example pattern (conceptual):
- Rank 0 sends data to rank 1 using a send call.
- Rank 1 posts a receive call to get data from rank 0.
Correctness depends on matching:
- Same communicator.
- Compatible datatypes and counts.
- Matching tags and ranks (unless wildcards like MPI_ANY_SOURCE or MPI_ANY_TAG are used).
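The following sketch makes the rank 0 to rank 1 pattern concrete. It is illustrative only: it assumes the job was started with at least two processes, and the payload (a single int) and the tag value are arbitrary choices.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int tag = 0;              /* arbitrary tag for this example */

    if (rank == 0) {
        int value = 42;             /* placeholder payload */
        /* Send one int to rank 1. */
        MPI_Send(&value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        /* Receive one int from rank 0; the datatype, count, tag, and
         * communicator must be compatible with the matching send. */
        MPI_Recv(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("Rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}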
Basic Collective Operations (Conceptual Overview)
MPI also defines collective communication routines, which involve all processes in a communicator.
These are higher-level than point-to-point operations. Examples (to be covered in detail in their own chapter):
- Broadcast (Bcast): One process (root) sends the same data to all others.
- Gather: Processes send data to a root, which collects them.
- Scatter: Root divides data and sends a part to each process.
- Allgather / Alltoall: All processes exchange data with all others.
- Reduction (Reduce, Allreduce): Apply an operation (sum, max, min, etc.) across all processes and return the result.
At this stage, you only need to recognize that:
- Collectives save you from manually writing many sends/receives.
- They are designed to be efficient and scalable on HPC systems.
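As a small code sketch of what collectives look like in practice (the values used are placeholders), the following broadcasts an integer from rank 0 to all processes and then sums one contribution per rank back onto rank 0 with MPI_Reduce:
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Broadcast: the root (rank 0) sends the same value to everyone. */
    int parameter = (rank == 0) ? 100 : 0;   /* placeholder value */
    MPI_Bcast(&parameter, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* Reduction: sum each rank's contribution onto rank 0. */
    int contribution = rank;                 /* placeholder contribution */
    int total = 0;
    MPI_Reduce(&contribution, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("Broadcast %d to all ranks; sum of ranks = %d\n",
               parameter, total);
    }

    MPI_Finalize();
    return 0;
}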
Communicators and Groups: The Context for Communication
All MPI communications happen within communicators. Conceptually:
- A communicator is a handle that represents:
- A set of processes (a group).
- A communication context (messages in one communicator do not mix with another).
- MPI_COMM_WORLD is the default communicator containing all processes started in your job.
Reasons communicators are important:
- They allow you to partition your MPI job into subgroups, each with its own communication.
- They help prevent interference between different parts of a large application or between different libraries used inside the same MPI program.
Later chapters will cover:
- How to create new communicators.
- How communicators are used in process topologies.
For now, you just need to recognize that nearly every MPI call asks you for a communicator, and MPI_COMM_WORLD is the simplest and most common choice in small examples.
Basic Error Handling and Return Codes
Most MPI routines return an integer error code:
- MPI_SUCCESS indicates that the routine completed successfully.
- Other codes indicate specific error conditions.
In C, you typically write:
int err = MPI_SomeCall(...);
if (err != MPI_SUCCESS) {
// handle error (or abort)
}
For small educational codes, you might ignore error codes, but in real applications, checking them helps diagnose issues such as:
- Invalid arguments (e.g., negative counts).
- Mismatched datatypes or communicators.
- Use of MPI after MPI_Finalize.
MPI also provides ways to change the error handler attached to a communicator (for example, to have calls return error codes instead of aborting all processes, which is the default behavior); this is covered more thoroughly in advanced usage.
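A minimal sketch of this pattern is shown below. It assumes you want MPI calls to return error codes rather than abort the job; MPI_Error_string turns a numeric code into a readable message.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    /* By default, an error on MPI_COMM_WORLD aborts the whole job.
     * MPI_ERRORS_RETURN makes calls return error codes instead. */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    int rank;
    int err = MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (err != MPI_SUCCESS) {
        /* Translate the numeric code into a readable message. */
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(err, msg, &len);
        fprintf(stderr, "MPI error: %s\n", msg);
        MPI_Abort(MPI_COMM_WORLD, err);   /* stop all processes */
    }

    MPI_Finalize();
    return 0;
}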
MPI Standards and Versions (High-Level)
The MPI standard has evolved:
- MPI-1: Core functionality (point-to-point, collectives, basic datatypes, communicators).
- MPI-2: Added parallel I/O, dynamic process management, some one-sided communication.
- MPI-3: Extended one-sided communication, new collectives, removed official C++ bindings.
- MPI-4: Added persistent collectives, partitioned point-to-point communication, sessions, large-count ("big count") routine variants, and more.
For introductory usage:
- The basic send/receive and collectives you’ll encounter first are present and stable across all recent MPI versions.
- You should write code that adheres to the standard, not tied to a specific implementation.
On many HPC systems, you can check the version of MPI by running:
- mpirun --version or mpiexec --version
- Or programmatically via MPI_Get_version (a small sketch follows below).
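For completeness, here is a tiny sketch of the programmatic query; MPI_Get_version reports the version of the MPI standard that the library implements (e.g., 3.1 or 4.0):
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int version, subversion;
    MPI_Get_version(&version, &subversion);   /* MPI standard version */

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        printf("This library implements MPI %d.%d\n", version, subversion);
    }

    MPI_Finalize();
    return 0;
}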
Typical Development Workflow with MPI
To summarize how MPI fits into a basic workflow:
- Write code:
  - Include MPI headers (e.g., #include <mpi.h>).
  - Add MPI_Init and MPI_Finalize.
  - Use MPI_Comm_size and MPI_Comm_rank to discover the environment.
  - Insert appropriate MPI communication calls (to be learned in the following chapters).
- Compile with MPI wrappers:
  - Use mpicc, mpicxx, or mpif90.
- Run with an MPI launcher:
  - On a workstation: mpirun -np N ./a.out
  - On a cluster: via the job scheduler (using srun, mpirun inside job scripts, etc.).
- Debug and profile:
  - Use standard debugging techniques plus MPI-specific tools (covered in later chapters).
What You Should Be Able to Do After This Chapter
After studying this introduction, you should be able to:
- Explain what MPI is and why it is central to distributed-memory programming.
- Recognize the SPMD model and the role of ranks and communicators.
- Write a minimal MPI program that:
- Initializes and finalizes MPI.
- Queries the number of processes and the calling process’s rank.
- Prints some rank-dependent output.
- Understand, at a conceptual level, the difference between:
- Point-to-point communication (send/receive).
- Collective communication (broadcast, reduce, etc.).
- MPI_COMM_WORLD vs. other communicators.
Subsequent chapters will build on this foundation to cover point-to-point communication, collectives, communicators, process topologies, and performance aspects in much more detail.