Understanding Communicators in MPI
In MPI, communicators define which processes can talk to each other, and how they are grouped. Almost everything you do with MPI uses a communicator, whether you notice it or not.
This chapter focuses on:
- What a communicator is
- The difference between groups and communicators
- Intracommunicators vs intercommunicators
- Creating and splitting communicators
- Duplicating communicators for library safety
- Practical use cases and common patterns
We assume basic familiarity with MPI processes and simple point-to-point / collective operations from previous sections.
What is a Communicator?
Conceptually, a communicator is:
- A set of MPI processes (a group),
- Together with the communication context (a tag space that’s private to that communicator),
- And a local numbering of processes within that communicator (ranks `0, 1, ..., size-1`).
Most MPI calls take a communicator argument, for example:
- `MPI_Comm_size(comm, &size);`
- `MPI_Comm_rank(comm, &rank);`
- `MPI_Bcast(..., comm);`
- `MPI_Send(..., dest, tag, comm);`
The communicator:
- Defines which processes are participating in the operation.
- Defines which ranks are valid (e.g. `dest` in `MPI_Send`).
- Isolates communications: messages sent on one communicator cannot be accidentally received on another.
The default communicator you almost always start with is:
`MPI_COMM_WORLD`, the communicator containing all processes in your MPI job.
Groups vs Communicators
MPI distinguishes between:
- Groups (`MPI_Group`):
  - Purely a set of process identifiers (relative to some parent group).
  - No communication operations are done on groups.
  - You use them to define subsets or reorderings of processes.
- Communicators (`MPI_Comm`):
  - A group plus a communication context.
  - All point-to-point and collective operations require a communicator, not a group.
Typical workflow:
- Start from the group of an existing communicator (often `MPI_COMM_WORLD`).
- Manipulate that group to build subgroups.
- Build new communicators from those groups.
Essential API pieces (conceptual, not full details):
- `MPI_Comm_group(comm, &group);` gets the group of a communicator.
- `MPI_Group_incl(world_group, n, ranks, &new_group);` builds a new group from selected ranks.
- `MPI_Comm_create(old_comm, new_group, &new_comm);` creates a communicator from a group.
Groups determine membership; communicators determine membership + messaging context.
Intracommunicators and Intercommunicators
MPI defines two main kinds of communicators:
Intracommunicators
- The most common type.
- All processes in the communicator form one group.
- Ranks are `0` to `size-1`.
- Used for “normal” communication within a single MPI job partition.
Examples:
- `MPI_COMM_WORLD`
- A communicator for all processes on a node
- A communicator for a process grid in a domain decomposition
Almost all introductory MPI code uses intracommunicators.
Intercommunicators
- Connect two disjoint groups of processes.
- Processes in one group can only address processes in the other group.
- Used for more advanced patterns, such as:
- Client–server models
- Dynamic process management (`MPI_Comm_spawn` and friends)
- Coupling separate MPI applications
For a beginner course, it’s enough to recognize that:
- You will typically use intracommunicators.
- Intercommunicators exist for more advanced or modular use cases.
Creating Communicators: Custom Subsets of Processes
Using only MPI_COMM_WORLD is simple but often suboptimal. Many algorithms need subgroups of processes to coordinate independently. Communicators are how you express that in MPI.
Why Create New Communicators?
Typical reasons:
- Different phases of a program use different subsets of ranks.
- You want to run independent collective operations simultaneously on disjoint sets of processes.
- You want to map processes to the structure of your problem, such as:
- A 1D, 2D, or 3D process grid
- “Color” or “role” (I/O processes, compute processes, etc.)
- You need to separate different algorithm stages or components so that:
- Their collective operations don’t interfere.
- Their tag spaces are isolated, avoiding accidental message mismatches.
Basic Pattern: From Group to Communicator
General procedure:
- Get the group of a starting communicator (often `MPI_COMM_WORLD`).
- Create a subgroup.
- Create a communicator from the subgroup.
Illustrative example (conceptual):
MPI_Comm world = MPI_COMM_WORLD;
MPI_Comm subcomm;
MPI_Group world_group, sub_group;
MPI_Comm_group(world, &world_group);
// Suppose we want a communicator containing ranks 0..3 only
int ranks[4] = {0, 1, 2, 3};
MPI_Group_incl(world_group, 4, ranks, &sub_group);
// Create communicator for that group
MPI_Comm_create(world, sub_group, &subcomm);
// Processes not in sub_group get MPI_COMM_NULL in subcomm

Key points:
- Processes not belonging to `sub_group` will receive `MPI_COMM_NULL` as the new communicator.
- You must check if `subcomm != MPI_COMM_NULL` before using it.
- Collective calls on `subcomm` involve only processes in that subgroup.
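Putting these points together, the example above might continue as follows (a sketch only; error checking is omitted, and the broadcast value is purely illustrative):

if (subcomm != MPI_COMM_NULL) {
    int subrank, subsize;
    MPI_Comm_rank(subcomm, &subrank);
    MPI_Comm_size(subcomm, &subsize);

    // a collective involving only world ranks 0..3
    int value = (subrank == 0) ? 42 : 0;
    MPI_Bcast(&value, 1, MPI_INT, 0, subcomm);

    MPI_Comm_free(&subcomm);   // release the communicator when done
}

// group handles are separate objects and should also be released
MPI_Group_free(&sub_group);
MPI_Group_free(&world_group);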
Splitting Communicators: `MPI_Comm_split`
MPI_Comm_split is the main, high-level tool for building subcommunicators from an existing one.
Conceptually, every process in the original communicator calls:
MPI_Comm_split(old_comm, color, key, &new_comm);

Parameters:
- `color`:
  - Determines subcommunicator membership.
  - All processes that pass the same `color` value end up in the same new communicator.
  - A process may use `MPI_UNDEFINED` as `color` to indicate it does not join any new communicator (it then receives `MPI_COMM_NULL`).
- `key`:
  - Determines the ordering (rank assignment) within that new communicator.
  - Processes with the same `color` are sorted by `key`, giving them ranks `0` to `subsize-1`.
- `new_comm`:
  - The resulting communicator for that `color`.
Example: Split by Even/Odd Rank
Suppose we want:
- One communicator for all processes with even ranks in `MPI_COMM_WORLD`.
- Another for all odd ranks.
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int color = world_rank % 2; // 0 for even ranks, 1 for odd
MPI_Comm even_odd_comm;
MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &even_odd_comm);
// Now there are two communicators:
// - color 0: ranks {0, 2, 4, ...} (with new ranks 0..neven-1)
// - color 1: ranks {1, 3, 5, ...} (with new ranks 0..nodd-1)

Within each of these new communicators:
`MPI_Comm_size(even_odd_comm, &size);` and `MPI_Comm_rank(even_odd_comm, &rank);`
refer to the size and rank inside that subcommunicator, not in MPI_COMM_WORLD.
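For instance, each of the two groups can run its own collective at the same time. A sketch, continuing the example above (`sum_of_world_ranks` is just an illustrative variable):

int sub_rank, sub_size;
MPI_Comm_rank(even_odd_comm, &sub_rank);
MPI_Comm_size(even_odd_comm, &sub_size);

// each group independently sums the world ranks of its members;
// the even and odd groups do this concurrently without interfering
int sum_of_world_ranks;
MPI_Allreduce(&world_rank, &sum_of_world_ranks, 1, MPI_INT, MPI_SUM, even_odd_comm);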
Example: Split by Role
Common pattern: assign roles to processes and create a communicator per role.
// For example, rank 0 is "I/O master", others are "compute"
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int color;
if (world_rank == 0) {
color = 0; // IO group
} else {
color = 1; // compute group
}
MPI_Comm role_comm;
MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &role_comm);
// Now you can do collectives independently within IO or compute groups.

This is useful for (the I/O case is sketched after this list):
- I/O aggregation (only some processes handle filesystem I/O).
- Heterogeneous work assignments.
- Staging and coupling multi-physics solvers.
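As an illustration of the I/O-aggregation case, a sketch building on the role split above (the tag value and variable names are ours, not part of any API):

double local_result = 0.0;   // stand-in for data computed by this process

if (color == 1) {            // compute group
    double group_result;
    MPI_Reduce(&local_result, &group_result, 1, MPI_DOUBLE, MPI_SUM, 0, role_comm);

    int role_rank;
    MPI_Comm_rank(role_comm, &role_rank);
    if (role_rank == 0) {
        // the compute group's rank 0 forwards the aggregated result
        // to the I/O master (world rank 0)
        MPI_Send(&group_result, 1, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD);
    }
} else {                     // I/O master (world rank 0)
    double group_result;
    MPI_Recv(&group_result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, 99,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    // ... write group_result to disk ...
}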
Duplicating Communicators: `MPI_Comm_dup`
Different parts of a code (or different libraries) may want to use the same set of processes but avoid interfering with each other’s messages or collectives. MPI_Comm_dup creates a new communicator with:
- The same group of processes,
- But a distinct communication context.
Conceptually:
MPI_Comm new_comm;
MPI_Comm_dup(old_comm, &new_comm);

Properties:
- `new_comm` has the same size and processes as `old_comm`.
- Ranks in `new_comm` match ranks in `old_comm`.
- Messages sent on `old_comm` cannot accidentally match receives on `new_comm`, and vice versa.
Typical use:
- Library codes often take a communicator as input and then call `MPI_Comm_dup` internally.
- This isolates the library's internal communications from the user's communications.
Usage pattern in a library (simplified):
void my_library_init(MPI_Comm user_comm, my_lib_handle *h)
{
MPI_Comm_dup(user_comm, &h->lib_comm);
// Now library uses h->lib_comm for all MPI calls
}

Using `MPI_COMM_NULL`
Some communicator creation calls may yield MPI_COMM_NULL on some ranks:
- `MPI_Comm_create`: processes not in the selected group get `MPI_COMM_NULL`.
- `MPI_Comm_split`: processes that pass `color = MPI_UNDEFINED` get `MPI_COMM_NULL`.
Any MPI calls that expect a valid communicator must not be given MPI_COMM_NULL. Always check:
if (new_comm != MPI_COMM_NULL) {
MPI_Comm_rank(new_comm, &subrank);
MPI_Comm_size(new_comm, &subsize);
// safe MPI operations on new_comm
}

This allows you to:
- Logically “remove” processes from a computation stage.
- Avoid unnecessary participation in collective operations.
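For example, a process can opt out of a stage entirely by passing `MPI_UNDEFINED` as its color. A minimal sketch (`participates` is an illustrative flag, and `world_rank` is obtained as in the earlier examples):

int color = participates ? 0 : MPI_UNDEFINED;
MPI_Comm stage_comm;
MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &stage_comm);

if (stage_comm != MPI_COMM_NULL) {
    // only participating processes enter this block
    MPI_Barrier(stage_comm);
    MPI_Comm_free(&stage_comm);
}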
Practical Use Cases for Custom Communicators
1. Decomposing the Domain
In structured grid problems, you might:
- Create a 2D or 3D Cartesian communicator (see process topologies section).
- Use subcommunicators for:
- Rows vs columns
- Slices in one dimension
- Perform collectives only within a line or plane of processes.
Communicators encode the geometry of your process layout, matching the problem geometry.
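For example, row communicators for a 2D process grid can be built directly with `MPI_Comm_split` (a sketch; `ncols` is an illustrative grid width, and Cartesian topologies via `MPI_Cart_create`/`MPI_Cart_sub` offer a more structured alternative, covered in the process topologies section):

int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

int row = world_rank / ncols;          // processes in the same grid row share a color
MPI_Comm row_comm;
MPI_Comm_split(MPI_COMM_WORLD, row, world_rank, &row_comm);

// a reduction along one row of the process grid only
double local_value = 1.0;              // stand-in for per-process data
double row_sum;
MPI_Allreduce(&local_value, &row_sum, 1, MPI_DOUBLE, MPI_SUM, row_comm);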
2. Overlapping Multiple Algorithms
A single MPI job may run multiple algorithms in parallel, each on a subset of processes:
- Algorithm A uses ranks `0..p-1`.
- Algorithm B uses ranks `p..P-1`.
Separate communicators avoid conflicts when each algorithm calls its own collectives or uses overlapping tags.
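A sketch of such a split (here `p` is an illustrative cut-off and the algorithm calls are placeholders, not real functions):

int color = (world_rank < p) ? 0 : 1;
MPI_Comm algo_comm;
MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &algo_comm);

if (color == 0) {
    // run_algorithm_A(algo_comm);  // its collectives involve ranks 0..p-1 only
} else {
    // run_algorithm_B(algo_comm);  // its collectives involve ranks p..P-1 only
}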
3. Resource-Aware Grouping
You can align communicators with hardware resources:
- One communicator per node or socket.
- One communicator per GPU or NUMA domain.
This supports node-local or socket-local collectives, which may be more efficient (e.g., shared-memory aware operations).
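With MPI-3 and later, one common way to obtain a per-node communicator is `MPI_Comm_split_type` with `MPI_COMM_TYPE_SHARED` (a sketch):

// group together all processes that can share memory, i.e. that run on the same node
MPI_Comm node_comm;
MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                    MPI_INFO_NULL, &node_comm);

int node_rank, node_size;
MPI_Comm_rank(node_comm, &node_rank);
MPI_Comm_size(node_comm, &node_size);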
4. Modular Software Design
Large codes composed of multiple modules or libraries often:
- Pass communicators between modules.
- Duplicate communicators per module to isolate contexts.
- Use subcommunicators for separate physics components or solver stages.
Communicators thus support composability: independent pieces of code use MPI safely in the same job.
Communicators and Collectives
Collective operations are always scoped to a communicator:
- All processes in that communicator must participate (unless explicitly stated otherwise in MPI standard).
- Processes not in that communicator must not call that collective.
This implies:
- You cannot “partially” call a collective on `MPI_COMM_WORLD` with only half the ranks.
- If you want a collective over a subset, you must create an appropriate communicator and call the collective on that communicator.
Example pattern:
- Build subcommunicator for the relevant subset.
- Call collective only on that communicator.
// Suppose subcomm contains only the processes that need to participate
if (subcomm != MPI_COMM_NULL) {
MPI_Bcast(data, count, MPI_DOUBLE, root_in_subcomm, subcomm);
}

This is central to writing correct and scalable distributed-memory codes.
Lifetime and Cleanup
Communicators are MPI objects that you should free when no longer needed:
MPI_Comm_free(&subcomm);

Notes:
- After `MPI_Comm_free`, the communicator handle becomes invalid.
- You must ensure that no MPI operations use it after it is freed.
- For dynamically created communicators in long-running jobs, proper cleanup avoids exhausting internal MPI resources.
MPI_COMM_WORLD and certain predefined communicators are managed by MPI and must not be freed.
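For example, a library that duplicated a communicator in its init routine (as sketched earlier) would release it in its cleanup routine:

void my_library_finalize(my_lib_handle *h)
{
    // release the communicator duplicated in my_library_init
    MPI_Comm_free(&h->lib_comm);
}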
Summary
- A communicator defines:
- A group of processes,
- A communication context,
- And the rank numbering within that group.
- `MPI_COMM_WORLD` is the global communicator, but nontrivial applications almost always define additional communicators.
- Groups are building blocks; communicators are what you actually use for communication.
- `MPI_Comm_split` is a powerful way to partition processes into role-based or geometry-based subcommunicators.
- `MPI_Comm_dup` creates a safe, isolated context for libraries and modules without changing membership.
- Proper design of communicators enables:
- Clean modular code,
- Safe concurrent use of MPI by different components,
- Efficient, localized collectives on subsets of processes.