Understanding Communicators in MPI
In MPI, communicators define which processes can talk to each other, and how they are grouped. Almost everything you do with MPI uses a communicator, whether you notice it or not.
This chapter focuses on:
- What a communicator is
- The difference between groups and communicators
- Intracommunicators vs intercommunicators
- Creating and splitting communicators
- Duplicating communicators for library safety
- Practical use cases and common patterns
We assume basic familiarity with MPI processes and simple point-to-point / collective operations from previous sections.
What is a Communicator?
Conceptually, a communicator is:
- A set of MPI processes (a group),
- Together with the communication context (a tag space that’s private to that communicator),
- And a local numbering of processes within that communicator (ranks `0, 1, ..., size-1`).
Most MPI calls take a communicator argument, for example:
- `MPI_Comm_size(comm, &size);`
- `MPI_Comm_rank(comm, &rank);`
- `MPI_Bcast(..., comm);`
- `MPI_Send(..., dest, tag, comm);`
The communicator:
- Defines which processes are participating in the operation.
- Defines which ranks are valid (e.g. `dest` in `MPI_Send`).
- Isolates communications: messages sent on one communicator cannot be accidentally received on another.
The default communicator you almost always start with is:
`MPI_COMM_WORLD`, the communicator containing all processes in your MPI job.
Groups vs Communicators
MPI distinguishes between:
- Groups (`MPI_Group`):
  - Purely a set of process identifiers (relative to some parent group).
  - No communication operations are done on groups.
  - You use them to define subsets or reorderings of processes.
- Communicators (`MPI_Comm`):
  - A group plus a communication context.
  - All point-to-point and collective operations require a communicator, not a group.
Typical workflow:
- Start from the group of an existing communicator (often `MPI_COMM_WORLD`).
- Manipulate that group to build subgroups.
- Build new communicators from those groups.
Essential API pieces (conceptual, not full details):
- `MPI_Comm_group(comm, &group);` gets the group of a communicator.
- `MPI_Group_incl(world_group, n, ranks, &new_group);` builds a new group from selected ranks.
- `MPI_Comm_create(old_comm, new_group, &new_comm);` creates a communicator from a group.
Groups determine membership; communicators determine membership + messaging context.
Intracommunicators and Intercommunicators
MPI defines two main kinds of communicators:
Intracommunicators
- The most common type.
- All processes in the communicator form one group.
- Ranks are `0` to `size-1`.
- Used for “normal” communication within a single MPI job partition.
Examples:
- `MPI_COMM_WORLD`
- A communicator for all processes on a node
- A communicator for a process grid in a domain decomposition
Almost all introductory MPI code uses intracommunicators.
Intercommunicators
- Connect two disjoint groups of processes.
- Processes in one group can only address processes in the other group.
- Used for more advanced patterns, such as:
- Client–server models
- Dynamic process management (`MPI_Comm_spawn` and friends)
- Coupling separate MPI applications
For a beginner course, it’s enough to recognize that:
- You will typically use intracommunicators.
- Intercommunicators exist for more advanced or modular use cases.
Creating Communicators: Custom Subsets of Processes
Using only MPI_COMM_WORLD is simple but often suboptimal. Many algorithms need subgroups of processes to coordinate independently. Communicators are how you express that in MPI.
Why Create New Communicators?
Typical reasons:
- Different phases of a program use different subsets of ranks.
- You want to run independent collective operations simultaneously on disjoint sets of processes.
- You want to map processes to the structure of your problem, such as:
- A 1D, 2D, or 3D process grid
- “Color” or “role” (I/O processes, compute processes, etc.)
- You need to separate different algorithm stages or components so that:
- Their collective operations don’t interfere.
- Their tag spaces are isolated, avoiding accidental message mismatches.
Basic Pattern: From Group to Communicator
General procedure:
- Get the group of a starting communicator (often `MPI_COMM_WORLD`).
- Create a subgroup.
- Create a communicator from the subgroup.
Illustrative example (conceptual):
MPI_Comm world = MPI_COMM_WORLD;
MPI_Comm subcomm;
MPI_Group world_group, sub_group;
MPI_Comm_group(world, &world_group);
// Suppose we want a communicator containing ranks 0..3 only
int ranks[4] = {0, 1, 2, 3};
MPI_Group_incl(world_group, 4, ranks, &sub_group);
// Create communicator for that group
MPI_Comm_create(world, sub_group, &subcomm);
// Processes not in sub_group get MPI_COMM_NULL in subcomm

Key points:
- Processes not belonging to `sub_group` will receive `MPI_COMM_NULL` as the new communicator.
- You must check if `subcomm != MPI_COMM_NULL` before using it.
- Collective calls on `subcomm` involve only processes in that subgroup.
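Putting these points together, the example above might continue as follows (a sketch only; error checking is omitted, and the broadcast value is purely illustrative):

if (subcomm != MPI_COMM_NULL) {
    int subrank, subsize;
    MPI_Comm_rank(subcomm, &subrank);
    MPI_Comm_size(subcomm, &subsize);

    // a collective involving only world ranks 0..3
    int value = (subrank == 0) ? 42 : 0;
    MPI_Bcast(&value, 1, MPI_INT, 0, subcomm);

    MPI_Comm_free(&subcomm);   // release the communicator when done
}

// group handles are separate objects and should also be released
MPI_Group_free(&sub_group);
MPI_Group_free(&world_group);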
Splitting Communicators: `MPI_Comm_split`
MPI_Comm_split is the main, high-level tool for building subcommunicators from an existing one.
Conceptually, every process in the original communicator calls:
MPI_Comm_split(old_comm, color, key, &new_comm);

Parameters:
- `color`:
  - Determines subcommunicator membership.
  - All processes that pass the same `color` value end up in the same new communicator.
  - A process may use `MPI_UNDEFINED` as `color` to indicate it does not join any new communicator (it then receives `MPI_COMM_NULL`).
- `key`:
  - Determines the ordering (rank assignment) within that new communicator.
  - Processes with the same `color` are sorted by `key`, giving them ranks `0` to `subsize-1`.
- `new_comm`:
  - The resulting communicator for that `color`.
Example: Split by Even/Odd Rank
Suppose we want:
- One communicator for all processes with even ranks in `MPI_COMM_WORLD`.
- Another for all odd ranks.
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int color = world_rank % 2; // 0 for even ranks, 1 for odd
MPI_Comm even_odd_comm;
MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &even_odd_comm);
// Now there are two communicators:
// - color 0: ranks {0, 2, 4, ...} (with new ranks 0..neven-1)
// - color 1: ranks {1, 3, 5, ...} (with new ranks 0..nodd-1)

Within each of these new communicators:
`MPI_Comm_size(even_odd_comm, &size);` and `MPI_Comm_rank(even_odd_comm, &rank);`
refer to the size and rank inside that subcommunicator, not in MPI_COMM_WORLD.
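For instance, each of the two groups can run its own collective at the same time. A sketch, continuing the example above (`sum_of_world_ranks` is just an illustrative variable):

int sub_rank, sub_size;
MPI_Comm_rank(even_odd_comm, &sub_rank);
MPI_Comm_size(even_odd_comm, &sub_size);

// each group independently sums the world ranks of its members;
// the even and odd groups do this concurrently without interfering
int sum_of_world_ranks;
MPI_Allreduce(&world_rank, &sum_of_world_ranks, 1, MPI_INT, MPI_SUM, even_odd_comm);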
Example: Split by Role
Common pattern: assign roles to processes and create a communicator per role.
// For example, rank 0 is "I/O master", others are "compute"
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int color;
if (world_rank == 0) {
color = 0; // IO group
} else {
color = 1; // compute group
}
MPI_Comm role_comm;
MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &role_comm);
// Now you can do collectives independently within IO or compute groups.

This is useful for (the I/O case is sketched after this list):
- I/O aggregation (only some processes handle filesystem I/O).
- Heterogeneous work assignments.
- Staging and coupling multi-physics solvers.
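As an illustration of the I/O-aggregation case, a sketch building on the role split above (the tag value and variable names are ours, not part of any API):

double local_result = 0.0;   // stand-in for data computed by this process

if (color == 1) {            // compute group
    double group_result;
    MPI_Reduce(&local_result, &group_result, 1, MPI_DOUBLE, MPI_SUM, 0, role_comm);

    int role_rank;
    MPI_Comm_rank(role_comm, &role_rank);
    if (role_rank == 0) {
        // the compute group's rank 0 forwards the aggregated result
        // to the I/O master (world rank 0)
        MPI_Send(&group_result, 1, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD);
    }
} else {                     // I/O master (world rank 0)
    double group_result;
    MPI_Recv(&group_result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, 99,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    // ... write group_result to disk ...
}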
Duplicating Communicators: `MPI_Comm_dup`
Different parts of a code (or different libraries) may want to use the same set of processes but avoid interfering with each other’s messages or collectives. MPI_Comm_dup creates a new communicator with:
- The same group of processes,
- But a distinct communication context.
Conceptually:
MPI_Comm new_comm;
MPI_Comm_dup(old_comm, &new_comm);

Properties:
- `new_comm` has the same size and processes as `old_comm`.
- Ranks in `new_comm` match ranks in `old_comm`.
- Messages sent on `old_comm` cannot accidentally match receives on `new_comm`, and vice versa.
Typical use:
- Library codes often take a communicator as input and then call `MPI_Comm_dup` internally.
- This isolates the library's internal communications from the user's communications.
Usage pattern in a library (simplified):
void my_library_init(MPI_Comm user_comm, my_lib_handle *h)
{
MPI_Comm_dup(user_comm, &h->lib_comm);
// Now library uses h->lib_comm for all MPI calls
}

Using `MPI_COMM_NULL`
Some communicator creation calls may yield MPI_COMM_NULL on some ranks:
- `MPI_Comm_create`: processes not in the selected group get `MPI_COMM_NULL`.
- `MPI_Comm_split`: processes that pass `color = MPI_UNDEFINED` get `MPI_COMM_NULL`.
Any MPI calls that expect a valid communicator must not be given MPI_COMM_NULL. Always check:
if (new_comm != MPI_COMM_NULL) {
MPI_Comm_rank(new_comm, &subrank);
MPI_Comm_size(new_comm, &subsize);
// safe MPI operations on new_comm
}

This allows you to:
- Logically “remove” processes from a computation stage.
- Avoid unnecessary participation in collective operations.
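For example, a process can opt out of a stage entirely by passing `MPI_UNDEFINED` as its color. A minimal sketch (`participates` is an illustrative flag, and `world_rank` is obtained as in the earlier examples):

int color = participates ? 0 : MPI_UNDEFINED;
MPI_Comm stage_comm;
MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &stage_comm);

if (stage_comm != MPI_COMM_NULL) {
    // only participating processes enter this block
    MPI_Barrier(stage_comm);
    MPI_Comm_free(&stage_comm);
}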
Practical Use Cases for Custom Communicators
1. Decomposing the Domain
In structured grid problems, you might:
- Create a 2D or 3D Cartesian communicator (see process topologies section).
- Use subcommunicators for:
- Rows vs columns
- Slices in one dimension
- Perform collectives only within a line or plane of processes.
Communicators encode the geometry of your process layout, matching the problem geometry.
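For example, row communicators for a 2D process grid can be built directly with `MPI_Comm_split` (a sketch; `ncols` is an illustrative grid width, and Cartesian topologies via `MPI_Cart_create`/`MPI_Cart_sub` offer a more structured alternative, covered in the process topologies section):

int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

int row = world_rank / ncols;          // processes in the same grid row share a color
MPI_Comm row_comm;
MPI_Comm_split(MPI_COMM_WORLD, row, world_rank, &row_comm);

// a reduction along one row of the process grid only
double local_value = 1.0;              // stand-in for per-process data
double row_sum;
MPI_Allreduce(&local_value, &row_sum, 1, MPI_DOUBLE, MPI_SUM, row_comm);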
2. Overlapping Multiple Algorithms
A single MPI job may run multiple algorithms in parallel, each on a subset of processes:
- Algorithm A uses ranks `0..p-1`.
- Algorithm B uses ranks `p..P-1`.
Separate communicators avoid conflicts when each algorithm calls its own collectives or uses overlapping tags.
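A sketch of such a split (here `p` is an illustrative cut-off and the algorithm calls are placeholders, not real functions):

int color = (world_rank < p) ? 0 : 1;
MPI_Comm algo_comm;
MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &algo_comm);

if (color == 0) {
    // run_algorithm_A(algo_comm);  // its collectives involve ranks 0..p-1 only
} else {
    // run_algorithm_B(algo_comm);  // its collectives involve ranks p..P-1 only
}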
3. Resource-Aware Grouping
You can align communicators with hardware resources:
- One communicator per node or socket.
- One communicator per GPU or NUMA domain.
This supports node-local or socket-local collectives, which may be more efficient (e.g., shared-memory aware operations).
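With MPI-3 and later, one common way to obtain a per-node communicator is `MPI_Comm_split_type` with `MPI_COMM_TYPE_SHARED` (a sketch):

// group together all processes that can share memory, i.e. that run on the same node
MPI_Comm node_comm;
MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                    MPI_INFO_NULL, &node_comm);

int node_rank, node_size;
MPI_Comm_rank(node_comm, &node_rank);
MPI_Comm_size(node_comm, &node_size);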
4. Modular Software Design
Large codes composed of multiple modules or libraries often:
- Pass communicators between modules.
- Duplicate communicators per module to isolate contexts.
- Use subcommunicators for separate physics components or solver stages.
Communicators thus support composability: independent pieces of code use MPI safely in the same job.
Communicators and Collectives
Collective operations are always scoped to a communicator:
- All processes in that communicator must participate (unless explicitly stated otherwise in MPI standard).
- Processes not in that communicator must not call that collective.
This implies:
- You cannot “partially” call a collective on `MPI_COMM_WORLD` with only half the ranks.
- If you want a collective over a subset, you must create an appropriate communicator and call the collective on that communicator.
Example pattern:
- Build subcommunicator for the relevant subset.
- Call collective only on that communicator.
// Suppose subcomm contains only the processes that need to participate
if (subcomm != MPI_COMM_NULL) {
MPI_Bcast(data, count, MPI_DOUBLE, root_in_subcomm, subcomm);
}

This is central to writing correct and scalable distributed-memory codes.
Lifetime and Cleanup
Communicators are MPI objects that you should free when no longer needed:
MPI_Comm_free(&subcomm);

Notes:
- After `MPI_Comm_free`, the communicator handle becomes invalid.
- You must ensure that no MPI operations use it after it is freed.
- For dynamically created communicators in long-running jobs, proper cleanup avoids exhausting internal MPI resources.
MPI_COMM_WORLD and certain predefined communicators are managed by MPI and must not be freed.
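For example, a library that duplicated a communicator in its init routine (as sketched earlier) would release it in its cleanup routine:

void my_library_finalize(my_lib_handle *h)
{
    // release the communicator duplicated in my_library_init
    MPI_Comm_free(&h->lib_comm);
}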
Summary
- A communicator defines:
- A group of processes,
- A communication context,
- And the rank numbering within that group.
- `MPI_COMM_WORLD` is the global communicator, but nontrivial applications almost always define additional communicators.
- Groups are building blocks; communicators are what you actually use for communication.
- `MPI_Comm_split` is a powerful way to partition processes into role-based or geometry-based subcommunicators.
- `MPI_Comm_dup` creates a safe, isolated context for libraries and modules without changing membership.
- Proper design of communicators enables:
- Clean modular code,
- Safe concurrent use of MPI by different components,
- Efficient, localized collectives on subsets of processes.