What MPI Is (and What It Is Not)
MPI (Message Passing Interface) is a standardized API for writing distributed-memory parallel programs, typically in C, C++, and Fortran.
Key points:
- Standard, not a language: MPI is a specification. Implementations like MPICH, Open MPI, Intel MPI follow the same standard.
- Library-based: You write normal C/C++/Fortran programs and link against an MPI library.
- Message passing: Parallelism is achieved by multiple processes that explicitly send and receive messages across a network.
- Distributed memory focus: Each process has its own address space. There is no shared memory between MPI processes.
MPI is designed to be:
- Portable: Same code should run on laptops, clusters, and supercomputers (with a suitable MPI library installed).
- Scalable: It can support thousands to hundreds of thousands of processes (and more).
- Language-agnostic (within its scope): Official bindings exist for C and Fortran; the C++ bindings were deprecated in MPI-2.2 and removed in MPI-3.
MPI is not:
- A shared-memory threading model (that’s OpenMP, threads, etc.).
- An automatic parallelizer. You must decide what to send where and when.
- A specific implementation (Open MPI, MPICH, etc. are implementations of the MPI standard).
Basic MPI Programming Model
MPI follows a SPMD (Single Program, Multiple Data) style by default:
- You write one program.
- The program is started as many times as you request (e.g., 4 processes).
- Each instance is called an MPI process.
- The processes execute the same code but can behave differently depending on their rank.
Core ideas:
- Process: An independent OS process running your MPI program.
- Rank: An integer that identifies a process within a group (communicator). Typically ranks are numbered 0, 1, 2, ..., N-1.
- Communicator: A group of processes that can communicate with each other. The default is MPI_COMM_WORLD, which contains all processes in your job.
Because processes do not share memory, all inter-process cooperation happens through explicit message passing:
- Point-to-point: One process sends a message to exactly one other process.
- Collective: A group of processes participates in an operation (broadcast, reduction, etc.).
This chapter focuses on the basic MPI concepts you need to understand the rest of the MPI-related chapters.
Setting Up and Running MPI Programs
You normally compile and run MPI programs using wrappers and launchers provided by the MPI implementation.
Typical commands (names may vary slightly per MPI library):
- Compilation:
  - C: mpicc mycode.c -o myprog
  - C++: mpicxx mycode.cpp -o myprog
  - Fortran: mpif90 mycode.f90 -o myprog
These wrapper compilers:
- Add the right include paths (e.g., mpi.h for C).
- Link against the correct MPI libraries.
- Execution:
You do not normally run MPI programs with ./myprog directly. Instead, you launch multiple processes with an MPI launcher. On many systems:
- mpirun -np 4 ./myprog
- mpiexec -n 4 ./myprog
On HPC clusters, this is often integrated with the batch scheduler (e.g., SLURM). There you may use:
srun ./myprog (with job script options specifying the number of tasks/processes)
Details of interaction with job schedulers are handled in the job scheduling chapters; here the main concept is that the MPI runtime is responsible for starting the requested number of processes and connecting them.
Minimal Structure of an MPI Program
Almost every MPI program in C/C++/Fortran has the same basic skeleton:
- Initialize MPI
- Query the environment (number of processes, rank, etc.)
- Perform communication and computation
- Finalize MPI
Initialization and Finalization
In C-style pseudocode:
#include <mpi.h>
int main(int argc, char *argv[]) {
MPI_Init(&argc, &argv); // 1. Initialize
// ... your MPI code here ...
MPI_Finalize(); // 4. Finalize
return 0;
}
Key points:
- MPI_Init must be called before any other MPI function.
- MPI_Finalize must be called last, after all communication is done.
- Both must be called by all MPI processes in the communicator (commonly MPI_COMM_WORLD).
In Fortran, the pattern is similar (with call MPI_INIT(ierr) and call MPI_FINALIZE(ierr)).
Discovering the Parallel Environment
You typically begin by asking:
- How many processes are there?
- Which process am I?
In C:
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
- MPI_Comm_size(comm, &size) returns the number of processes in the communicator comm.
- MPI_Comm_rank(comm, &rank) returns the calling process’s rank in comm.
These two queries are enough to let you:
- Assign different tasks to different ranks.
- Partition data between processes.
- Choose communication patterns based on rank.
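For example, a common first use of the rank is to partition a range of indices across processes. The sketch below is only illustrative: the problem size N and the "work" (summing a slice of indices) are placeholders, and it assumes N is divisible by the number of processes.
#include <stdio.h>
#include <mpi.h>

#define N 1000  /* total problem size (placeholder) */

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int world_size, world_rank;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Each rank works on its own contiguous block of indices. */
    int chunk = N / world_size;   /* assumes N is divisible by world_size */
    int start = world_rank * chunk;
    int end   = start + chunk;

    long local_sum = 0;
    for (int i = start; i < end; i++) {
        local_sum += i;           /* placeholder computation */
    }

    printf("Rank %d handled indices [%d, %d) with local sum %ld\n",
           world_rank, start, end, local_sum);

    MPI_Finalize();
    return 0;
}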
A First MPI Program: "Hello, World"
A classic first example is a parallel "Hello, world" where each process identifies itself.
C example:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
MPI_Init(&argc, &argv);
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
printf("Hello from rank %d out of %d processes\n",
world_rank, world_size);
MPI_Finalize();
return 0;
}
Key observations:
- The program is identical for all processes.
- Output lines differ because world_rank is different on each process.
- When run with -np 4, you will see four lines, one from each rank.
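When launched with four processes (e.g., mpirun -np 4 ./myprog), the output will look roughly like the following. The exact ordering of the lines is not guaranteed, because the ranks print independently:
Hello from rank 2 out of 4 processes
Hello from rank 0 out of 4 processes
Hello from rank 3 out of 4 processes
Hello from rank 1 out of 4 processes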
Basic MPI Data Types and Messages (High-Level View)
MPI functions often need to know:
- Buffer address: Where the data is in memory.
- Count: How many items.
- Datatype: What kind of items.
- Destination/source: Which rank to send to or receive from.
- Tag: A small integer label to help match messages.
- Communicator: The group of processes involved in this communication.
For example, most point-to-point operations take arguments of the general form:
- buffer
- count
- datatype (e.g., MPI_INT, MPI_DOUBLE)
- destination_rank / source_rank
- tag
- communicator (e.g., MPI_COMM_WORLD)
MPI provides its own datatypes (e.g., MPI_INT, MPI_DOUBLE, MPI_CHAR) to stay portable across machines where the size and representation of C and Fortran types may differ.
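As a concrete illustration of this argument pattern, here is the C prototype of MPI_Send, annotated with the roles listed above (MPI_Recv is analogous, but takes a source rank and an MPI_Status argument):
int MPI_Send(const void *buf,       /* buffer address               */
             int count,             /* number of items              */
             MPI_Datatype datatype, /* e.g., MPI_INT, MPI_DOUBLE    */
             int dest,              /* destination rank             */
             int tag,               /* message tag                  */
             MPI_Comm comm);        /* communicator                 */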
Basic Point-to-Point Communication Concepts
Details of point-to-point functions are covered in the corresponding chapter; here the aim is to introduce the ideas in abstract terms.
Core concepts:
- Send: A process sends data from its memory to another process.
- Receive: A process receives incoming data into its memory.
- Matching: A receive operation specifies from whom to receive (or “any source”), and what tag to expect (or “any tag”).
Typical operations (names only, not full explanation here):
- MPI_Send
- MPI_Recv
- Nonblocking variants like MPI_Isend, MPI_Irecv (introduced later in more detail)
Example pattern (conceptual):
- Rank 0 sends data to rank 1 using a send call.
- Rank 1 posts a receive call to get data from rank 0.
Correctness depends on matching:
- Same communicator.
- Compatible datatypes and counts.
- Matching tags and ranks (unless wildcards like MPI_ANY_SOURCE or MPI_ANY_TAG are used).
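The following sketch makes the rank 0 to rank 1 pattern concrete. It is illustrative only: it assumes the job was started with at least two processes, and the payload (a single int) and the tag value are arbitrary choices.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int tag = 0;              /* arbitrary tag for this example */

    if (rank == 0) {
        int value = 42;             /* placeholder payload */
        /* Send one int to rank 1. */
        MPI_Send(&value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        /* Receive one int from rank 0; the datatype, count, tag, and
         * communicator must be compatible with the matching send. */
        MPI_Recv(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("Rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}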
Basic Collective Operations (Conceptual Overview)
MPI also defines collective communication routines, which involve all processes in a communicator.
These are higher-level than point-to-point operations. Examples (to be covered in detail in their own chapter):
- Broadcast (Bcast): One process (root) sends the same data to all others.
- Gather: Processes send data to a root, which collects them.
- Scatter: Root divides data and sends a part to each process.
- Allgather / Alltoall: All processes exchange data with all others.
- Reduction (Reduce, Allreduce): Apply an operation (sum, max, min, etc.) across all processes and return the result.
At this stage, you only need to recognize that:
- Collectives save you from manually writing many sends/receives.
- They are designed to be efficient and scalable on HPC systems.
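As a small code sketch of what collectives look like in practice (the values used are placeholders), the following broadcasts an integer from rank 0 to all processes and then sums one contribution per rank back onto rank 0 with MPI_Reduce:
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Broadcast: the root (rank 0) sends the same value to everyone. */
    int parameter = (rank == 0) ? 100 : 0;   /* placeholder value */
    MPI_Bcast(&parameter, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* Reduction: sum each rank's contribution onto rank 0. */
    int contribution = rank;                 /* placeholder contribution */
    int total = 0;
    MPI_Reduce(&contribution, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("Broadcast %d to all ranks; sum of ranks = %d\n",
               parameter, total);
    }

    MPI_Finalize();
    return 0;
}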
Communicators and Groups: The Context for Communication
All MPI communications happen within communicators. Conceptually:
- A communicator is a handle that represents:
- A set of processes (a group).
- A communication context (messages in one communicator do not mix with another).
- MPI_COMM_WORLD is the default communicator containing all processes started in your job.
Reasons communicators are important:
- They allow you to partition your MPI job into subgroups, each with its own communication.
- They help prevent interference between different parts of a large application or between different libraries used inside the same MPI program.
Later chapters will cover:
- How to create new communicators.
- How communicators are used in process topologies.
For now, you just need to recognize that nearly every MPI call asks you for a communicator, and MPI_COMM_WORLD is the simplest and most common choice in small examples.
Basic Error Handling and Return Codes
Most MPI routines return an integer error code:
- MPI_SUCCESS indicates that the routine completed successfully.
- Other codes indicate specific error conditions.
In C, you typically write:
int err = MPI_SomeCall(...);
if (err != MPI_SUCCESS) {
// handle error (or abort)
}
For small educational codes, you might ignore error codes, but in real applications, checking them helps diagnose issues such as:
- Invalid arguments (e.g., negative counts).
- Mismatched datatypes or communicators.
- Use of MPI after MPI_Finalize.
MPI also provides ways to change the error handler attached to a communicator (for example, to have calls return error codes instead of aborting all processes, which is the default behavior); this is covered more thoroughly in advanced usage.
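A minimal sketch of this pattern is shown below. It assumes you want MPI calls to return error codes rather than abort the job; MPI_Error_string turns a numeric code into a readable message.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    /* By default, an error on MPI_COMM_WORLD aborts the whole job.
     * MPI_ERRORS_RETURN makes calls return error codes instead. */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    int rank;
    int err = MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (err != MPI_SUCCESS) {
        /* Translate the numeric code into a readable message. */
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(err, msg, &len);
        fprintf(stderr, "MPI error: %s\n", msg);
        MPI_Abort(MPI_COMM_WORLD, err);   /* stop all processes */
    }

    MPI_Finalize();
    return 0;
}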
MPI Standards and Versions (High-Level)
The MPI standard has evolved:
- MPI-1: Core functionality (point-to-point, collectives, basic datatypes, communicators).
- MPI-2: Added parallel I/O, dynamic process management, some one-sided communication.
- MPI-3: Extended one-sided communication, new collectives, removed official C++ bindings.
- MPI-4: Added persistent collectives, partitioned point-to-point communication, sessions, large-count ("big count") routine variants, and more.
For introductory usage:
- The basic send/receive and collectives you’ll encounter first are present and stable across all recent MPI versions.
- You should write code that adheres to the standard, not tied to a specific implementation.
On many HPC systems, you can check the version of MPI by running:
- mpirun --version or mpiexec --version
- Or programmatically via MPI_Get_version (a small sketch follows below).
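For completeness, here is a tiny sketch of the programmatic query; MPI_Get_version reports the version of the MPI standard that the library implements (e.g., 3.1 or 4.0):
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int version, subversion;
    MPI_Get_version(&version, &subversion);   /* MPI standard version */

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        printf("This library implements MPI %d.%d\n", version, subversion);
    }

    MPI_Finalize();
    return 0;
}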
Typical Development Workflow with MPI
To summarize how MPI fits into a basic workflow:
- Write code:
  - Include MPI headers (e.g., #include <mpi.h>).
  - Add MPI_Init and MPI_Finalize.
  - Use MPI_Comm_size and MPI_Comm_rank to discover the environment.
  - Insert appropriate MPI communication calls (to be learned in the following chapters).
- Compile with MPI wrappers:
  - Use mpicc, mpicxx, or mpif90.
- Run with an MPI launcher:
  - On a workstation: mpirun -np N ./a.out
  - On a cluster: via the job scheduler (using srun, mpirun inside job scripts, etc.).
- Debug and profile:
  - Use standard debugging techniques plus MPI-specific tools (covered in later chapters).
What You Should Be Able to Do After This Chapter
After studying this introduction, you should be able to:
- Explain what MPI is and why it is central to distributed-memory programming.
- Recognize the SPMD model and the role of ranks and communicators.
- Write a minimal MPI program that:
- Initializes and finalizes MPI.
- Queries the number of processes and the calling process’s rank.
- Prints some rank-dependent output.
- Understand, at a conceptual level, the difference between:
- Point-to-point communication (send/receive).
- Collective communication (broadcast, reduce, etc.).
- MPI_COMM_WORLD vs. other communicators.
Subsequent chapters will build on this foundation to cover point-to-point communication, collectives, communicators, process topologies, and performance aspects in much more detail.