Introduction
In distributed memory programming, the arrangement of processes is just as important as the number of processes. Many parallel algorithms assume that processes are arranged in a grid, a line, a ring, or some more complex pattern. This logical arrangement is called a process topology.
A process topology is independent of the physical arrangement of nodes and network links. It is a way for the program to express that “these processes are neighbors” even though they may be running on different nodes. Once a topology is defined, communication patterns can be described much more naturally, which simplifies code and can improve performance.
This chapter focuses on the idea of process topologies in distributed memory programs and, in particular, how they are handled in MPI. Concepts such as basic MPI processes and point to point communication are assumed to be known from earlier sections.
Motivation for Process Topologies
Many scientific and engineering problems are naturally expressed on grids, meshes, or graphs. For example, discretizing a 2D domain for a partial differential equation leads to a grid of points. If the grid is decomposed among processes, each process owns a subregion and communicates with neighboring subregions.
Without a topology, every process is identified by a single integer rank in MPI_COMM_WORLD. To locate neighbors you must perform arithmetic on ranks and maintain your own mapping between problem coordinates and process ranks. This quickly becomes tedious and error prone when you work with multidimensional decompositions or nontrivial neighborhood relationships.
A process topology lets you express this structure explicitly. Processes can be organized in a virtual grid or graph, and MPI can provide convenient queries such as “who is my neighbor in the +x direction” and “what are my coordinates in the 2D process grid”. This does not change the underlying physical network, but it gives the runtime additional information which can be used to optimize communication.
A process topology is a logical arrangement of MPI processes that reflects the communication pattern of the algorithm, such as a 1D line, 2D grid, 3D grid, or general graph of neighbors.
Virtual versus Physical Topologies
It is important to distinguish between virtual and physical topologies.
A virtual topology is defined at the MPI level. It is a pure software abstraction. You specify how processes are arranged and how they are connected in a logical sense, not how they are placed on actual hardware nodes or network switches.
The physical topology of the cluster includes the nodes, their interconnect links, and the routing between them. The job scheduler and MPI implementation decide which processes run on which nodes. Some MPI libraries can use information about the physical topology to map a virtual topology more intelligently, but as a programmer you typically describe only the virtual structure that matches your algorithm.
In simple programs, every process is equal and any process may communicate with any other. In more structured applications, the logical relationships are more constrained. Process topologies are especially useful when the application communication pattern is regular and neighbor based. For irregular or dynamically changing patterns, topology mechanisms are still helpful, but graph based features become more relevant than simple multidimensional grids.
Topologies and Communicators
In MPI, a topology is always associated with a communicator. MPI_COMM_WORLD has a trivial topology: there is no predefined structure beyond linear ranks. When you create a new topology, you also obtain a new communicator that encapsulates the same group of processes, or a subset, but with a topology attached.
Conceptually, you can think of this as follows. You start from a communicator that contains a group of processes. You call a constructor function that defines a topology. MPI returns a new communicator. The original communicator is unaffected and remains usable. The new communicator adds extra services, such as functions to query neighbor ranks or coordinates.
You are free to define several different topologies over the same set of processes. For example, one communicator may treat them as a 2D grid for one part of the algorithm, and another communicator may treat them as a ring for another part. Each communicator is independent, and you choose the one that best matches the communication needs at each stage.
Cartesian Process Topologies
The most common type of process topology in MPI is the Cartesian topology. It organizes processes in a regular grid of 1, 2, or more dimensions. Cartesian topologies are especially useful for applications that use structured meshes or stencil computations.
Defining a Cartesian grid
To define a Cartesian grid, you decide:
- The number of dimensions of the grid, for example 1D, 2D, or 3D.
- The number of processes in each dimension. The product of these numbers must not exceed the total number of processes in the communicator you use as a base; processes that do not fit into the grid are left out of the new communicator.
- Whether the grid is periodic in each dimension or has boundaries.
If you use MPI, this information is passed to the Cartesian constructor, which returns a communicator with the grid information attached. Each process has a rank in this new communicator as usual, but it also has coordinates in the grid. For a 2D grid, these coordinates are typically $(i, j)$.
The mapping from ranks to coordinates is controlled by MPI. You can either let MPI choose a mapping, which may allow it to optimize placement, or you can control it more directly by specifying how ranks should be ordered.
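As a concrete illustration, the following minimal sketch creates a 2D Cartesian communicator in C with MPI. MPI_Dims_create is used to factor the process count into a balanced grid, and the reorder flag permits MPI to renumber ranks if that helps placement; the variable names are illustrative.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int size, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Let MPI factor the process count into a balanced 2D grid, e.g. 12 -> 4 x 3. */
    int dims[2] = {0, 0};
    MPI_Dims_create(size, 2, dims);

    /* Non-periodic in both dimensions; reorder = 1 allows MPI to renumber ranks. */
    int periods[2] = {0, 0};
    MPI_Comm cart_comm;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart_comm);

    if (rank == 0)
        printf("Process grid: %d x %d\n", dims[0], dims[1]);

    MPI_Comm_free(&cart_comm);
    MPI_Finalize();
    return 0;
}
```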
Coordinates and ranks
Once the Cartesian topology exists, each process can query its coordinates and rank conversions using dedicated functions. Given a process rank in the Cartesian communicator, you can obtain its coordinates. Similarly, given coordinates, you can ask MPI to tell you the rank of the process that owns that grid position.
This allows you to work mostly in terms of coordinates that match the problem geometry instead of raw integer ranks. For example, in a 2D grid decomposition, you can think in terms of (row, column) indices. Your local indices within the subdomain and your process coordinates within the process grid become directly related.
Key mapping operations in Cartesian topologies
- Rank to coordinates: obtain a process coordinate vector from its rank in the Cartesian communicator.
- Coordinates to rank: obtain the rank of the process that resides at a given coordinate vector.
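Continuing the sketch above (same cart_comm and dims), these two mapping operations correspond to MPI_Cart_coords and MPI_Cart_rank; the particular query shown here, the first process in the caller's row, is only an example.

```c
/* Sketch: assumes cart_comm is the 2D Cartesian communicator created above. */
int my_rank, coords[2];
MPI_Comm_rank(cart_comm, &my_rank);

/* Rank -> coordinates of the calling process. */
MPI_Cart_coords(cart_comm, my_rank, 2, coords);

/* Coordinates -> rank: here, the process in the same row but in column 0. */
int first_in_row;
int query[2] = { coords[0], 0 };
MPI_Cart_rank(cart_comm, query, &first_in_row);
```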
Periodic boundaries
Cartesian topologies can be specified as periodic or non periodic along each dimension. Periodic means that processes on one edge of the grid consider processes on the opposite edge as neighbors. This is useful for simulations with domain wrap around, such as some fluid dynamics or plasma models that use periodic boundary conditions.
If a dimension is non periodic, processes at the boundaries have no neighbor outside the grid in that direction. When you query neighbors, MPI signals that there is no neighbor by returning the special value MPI_PROC_NULL. Communication directed at MPI_PROC_NULL completes immediately and does nothing, which lets you handle boundaries in a straightforward way without extra bookkeeping.
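A small sketch of this boundary behavior, again assuming the non-periodic 2D communicator created earlier:

```c
/* Neighbors along the non-periodic dimension 0 of cart_comm.
   With disp = +1, the first output is the neighbor at the lower coordinate,
   the second the neighbor at the higher coordinate. */
int lower_neighbor, upper_neighbor;
MPI_Cart_shift(cart_comm, 0, 1, &lower_neighbor, &upper_neighbor);

/* On the first row lower_neighbor is MPI_PROC_NULL, on the last row
   upper_neighbor is. Sends to and receives from MPI_PROC_NULL complete
   immediately and do nothing, so boundary and interior processes can
   execute the same communication code. */
```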
Neighbor Relationships in Cartesian Grids
The main benefit of a Cartesian topology is that it makes neighbor relationships simple and explicit. Instead of manually computing neighbor ranks, you can ask MPI to give you the rank of your neighbor in a particular dimension and direction.
In a typical 2D stencil computation, each process needs data from its north, south, east, and west neighbors. With a Cartesian communicator, each process identifies its coordinates, say $(i, j)$, then asks MPI for the neighbors in the positive and negative directions of each dimension. The resulting neighbor ranks can be used in point to point communication, often in a symmetric pattern where each process sends to and receives from its neighbors.
This approach reduces the risk of errors in rank arithmetic. For example, in a 2D process grid with $P_x$ processes in the horizontal direction and $P_y$ in the vertical direction, laid out in row-major order with periodic wrap around horizontally, manually computing neighbors involves expressions like:
$$\text{east\_rank} = (\text{my\_rank} + 1) \bmod P_x + P_x \times \text{row}, \qquad \text{row} = \lfloor \text{my\_rank} / P_x \rfloor$$
With a Cartesian topology, you can work at a higher level and let MPI handle such details. The code becomes easier to read and maintain, and algorithmic changes such as moving from a 2D to a 3D decomposition affect fewer parts of the program.
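The sketch below shows what such a neighbor exchange might look like, continuing from the 2D communicator above; each process exchanges a single double with its four neighbors, whereas a real stencil code would exchange whole halo rows and columns.

```c
/* Sketch: four-neighbor exchange on cart_comm (my_rank as queried earlier). */
int north, south, west, east;
MPI_Cart_shift(cart_comm, 0, 1, &north, &south);  /* dimension 0: rows    */
MPI_Cart_shift(cart_comm, 1, 1, &west,  &east);   /* dimension 1: columns */

double send_val = (double)my_rank;
double from_north, from_south, from_west, from_east;

MPI_Sendrecv(&send_val, 1, MPI_DOUBLE, south, 0,
             &from_north, 1, MPI_DOUBLE, north, 0, cart_comm, MPI_STATUS_IGNORE);
MPI_Sendrecv(&send_val, 1, MPI_DOUBLE, north, 1,
             &from_south, 1, MPI_DOUBLE, south, 1, cart_comm, MPI_STATUS_IGNORE);
MPI_Sendrecv(&send_val, 1, MPI_DOUBLE, east, 2,
             &from_west, 1, MPI_DOUBLE, west, 2, cart_comm, MPI_STATUS_IGNORE);
MPI_Sendrecv(&send_val, 1, MPI_DOUBLE, west, 3,
             &from_east, 1, MPI_DOUBLE, east, 3, cart_comm, MPI_STATUS_IGNORE);

/* At non-periodic edges some neighbor ranks are MPI_PROC_NULL and the
   corresponding transfers are no-ops; the matching receive buffers are
   then left untouched and should not be read. */
```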
For higher order stencils that require diagonal neighbors or more distant neighbors, you can compute the target coordinates directly, for example $(i+1, j+1)$ for a northeast neighbor, then transform them to ranks using the mapping function. This approach keeps the code close to the mathematical description of the algorithm.
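For instance, the rank of a diagonal neighbor could be obtained as sketched below, assuming the non-periodic grid from the earlier examples (dims as filled in by MPI_Dims_create, or queried with MPI_Cart_get). For non-periodic dimensions the coordinates must be range-checked by hand, while MPI_Cart_rank wraps out-of-range coordinates automatically in periodic dimensions.

```c
/* Sketch: rank of the diagonal neighbor at (i+1, j+1) in a non-periodic grid. */
int diag_coords[2] = { coords[0] + 1, coords[1] + 1 };
int diag_rank = MPI_PROC_NULL;   /* default: no such neighbor */
if (diag_coords[0] < dims[0] && diag_coords[1] < dims[1])
    MPI_Cart_rank(cart_comm, diag_coords, &diag_rank);
```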
Subgrids and Cartesian Subcommunicators
Many algorithms use hierarchical decompositions. For example, you might have a global 3D grid of processes, but you want to operate only on 2D planes or 1D lines at certain stages. Cartesian topologies include support for this. You can create subcommunicators that are restricted to lower dimensional slices of the full grid.
Conceptually, you start from a full Cartesian communicator. You specify which dimensions you want to keep and which to drop. MPI then constructs a new communicator whose processes correspond to those that share the same coordinates in the dropped dimensions. Inside this subcommunicator, processes are arranged in a lower dimensional Cartesian topology.
This mechanism allows you to match the structure of different parts of your algorithm. For example, in a 3D domain decomposition you might do global operations along each axis separately. By creating subcommunicators for lines or planes, you can express these operations more directly and often more efficiently.
Subcommunicators also let you separate concerns. You can write code for 1D operations that work only with rank and neighbors in that dimension, without worrying about the global 3D context.
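As a sketch, MPI_Cart_sub can build such subcommunicators; here a hypothetical 3D Cartesian communicator cart3d_comm is split into 1D "pencils" along its last dimension.

```c
/* Sketch: keep dimension 2, drop dimensions 0 and 1. Processes that share
   the same coordinates in the dropped dimensions end up in the same
   1D Cartesian subcommunicator. */
int remain_dims[3] = { 0, 0, 1 };
MPI_Comm line_comm;
MPI_Cart_sub(cart3d_comm, remain_dims, &line_comm);
```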
Graph and Distributed Graph Topologies
Not all communication patterns are regular grids. Some applications rely on arbitrary neighbor relationships that correspond to unstructured meshes or generic graphs. For such situations, MPI provides graph based topologies.
Basic graph topologies
A graph topology associates each process with a set of neighbor processes. You define the graph by listing, for each process, which other processes it is connected to. MPI then builds a topology that allows you to query neighbors easily. Unlike a Cartesian topology, a graph topology has no notion of geometric coordinates, only connectivity.
Graph topologies are suitable when you know the communication graph in advance and it does not change during the program. They are particularly useful in simulations using unstructured meshes, where each cell may have a different number of neighbors.
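The following sketch builds a small graph topology with MPI_Graph_create for four processes connected in a ring (0-1-2-3-0); every process passes the same global description, and the concrete graph is only illustrative.

```c
/* Sketch: 4-process ring as a graph topology. graph_index[i] is the
   cumulative number of edges of nodes 0..i; graph_edges lists neighbors
   node by node. Run with at least 4 processes; processes not included
   in the graph receive MPI_COMM_NULL. */
int nnodes = 4;
int graph_index[4] = { 2, 4, 6, 8 };
int graph_edges[8] = { 1, 3,  0, 2,  1, 3,  0, 2 };

MPI_Comm graph_comm;
MPI_Graph_create(MPI_COMM_WORLD, nnodes, graph_index, graph_edges, 1, &graph_comm);

/* Neighbors can then be queried with MPI_Graph_neighbors_count and
   MPI_Graph_neighbors. */
```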
Distributed graph topologies
For large scale applications, it is not practical for every process to know the entire communication graph. Distributed graph topologies allow each process to describe only its own neighbors. MPI collects this local information and constructs an internal representation of the graph topology.
The advantage of distributed graph topologies is scalability. Each process specifies outgoing edges to its neighbors, and MPI can determine the corresponding incoming relationships if needed. This approach matches the typical data distribution in many parallel codes, where each process knows its local part of the mesh and which neighboring subdomains it must communicate with.
A graph topology represents processes as nodes in a graph and their communication relationships as edges. A distributed graph topology allows each process to specify only its local edges, which is scalable for large unstructured problems.
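A minimal sketch using MPI_Dist_graph_create_adjacent: each process declares only its own neighbors, here its two neighbors in a ring over all ranks, and MPI assembles the global graph. The ring pattern is just an example.

```c
/* Sketch: every process lists only its own neighbors (a ring over all ranks). */
int size, rank;
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);

int left  = (rank - 1 + size) % size;
int right = (rank + 1) % size;
int neighbors[2] = { left, right };

MPI_Comm dist_comm;
MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                               2, neighbors, MPI_UNWEIGHTED,  /* incoming edges */
                               2, neighbors, MPI_UNWEIGHTED,  /* outgoing edges */
                               MPI_INFO_NULL, 1, &dist_comm);
```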
Topologies and Neighborhood Collectives
Once a topology is defined, MPI can offer optimized collective operations that are aware of neighbor relationships. These are called neighborhood collectives. They generalize point to point neighbor exchanges but are organized as collective operations over the topology communicator.
In a Cartesian or graph topology, neighborhood collectives allow each process to simultaneously send data to and receive data from its neighbors. The communication pattern is fixed by the topology and does not need to be specified explicitly at each call. This is particularly convenient for codes that perform regular halo exchanges in stencil computations or unstructured mesh updates.
Neighborhood collectives can be more efficient than manually written loops of point to point operations, because the MPI implementation can schedule and combine data transfers based on its knowledge of the full pattern. The programmer only provides the data buffers and message sizes, while the neighbor relationships are implied by the topology.
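As a sketch, a halo exchange on the ring-shaped distributed graph communicator from the previous example might use MPI_Neighbor_alltoall; each process contributes one double per neighbor, and the neighbor ordering follows the order given when the topology was created.

```c
/* Sketch: one double to and from each of the two neighbors of dist_comm. */
double sendbuf[2] = { 1.0, 2.0 };   /* entry 0 for 'left', entry 1 for 'right' */
double recvbuf[2];

MPI_Neighbor_alltoall(sendbuf, 1, MPI_DOUBLE,
                      recvbuf, 1, MPI_DOUBLE, dist_comm);

/* recvbuf[k] holds the value sent by the k-th incoming neighbor. */
```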
Performance Implications of Using Topologies
Process topologies are primarily a software abstraction, but they can influence performance in several ways.
First, they simplify the expression of regular communication patterns. Code that directly reflects the structure of the problem is easier to inspect and tune. Neighbor exchanges, halo updates, and grid sweeps become clear operations on process coordinates and neighbors. This can reduce logic errors that might otherwise cause unnecessary communication or synchronization.
Second, topologies give the MPI implementation additional context that it may use to choose better communication paths. For instance, if the MPI library has information about the physical network and the rank to node mapping, it can try to place neighboring processes close to each other in the hardware, for example on the same node or on nodes connected by low latency links. This is not guaranteed, but the topology information enables such optimizations.
Third, neighborhood collectives can reduce overhead by combining many small point to point messages into larger operations, or by overlapping communication in a way that would be difficult to manage manually. In large runs, where communication costs dominate, such improvements can be significant.
It is important to remember that topologies do not magically fix all performance problems. Poor data decompositions, unbalanced workloads, and excessive synchronization can still limit scalability. Topologies are one tool among many for organizing distributed memory programs. When used together with good domain decomposition and load balancing, they help express efficient communication patterns in a clear and maintainable way.
Designing Process Topologies for Applications
When designing a process topology for a specific application, the starting point is usually the structure of the data and the operations applied to it.
If the problem is defined on a regular grid, such as a structured mesh for solving PDEs, a Cartesian topology is usually the natural choice. The number of processes in each dimension is often chosen to match the decomposition of the grid into subdomains. For an $N_x \times N_y$ grid and a $P_x \times P_y$ process grid, each process may own a local block of size approximately $N_x / P_x$ by $N_y / P_y$.
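When the grid size is not divisible by the number of processes, the local block sizes differ by one; a common convention is sketched below (the helper names are illustrative, not part of MPI).

```c
/* Sketch: block decomposition of N grid points over P processes along one
   dimension; the first (N % P) processes get one extra point. */
int block_size(int N, int P, int p) {
    int base = N / P, rem = N % P;
    return base + (p < rem ? 1 : 0);
}

int block_start(int N, int P, int p) {
    int base = N / P, rem = N % P;
    return p * base + (p < rem ? p : rem);
}
```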
For more irregular domains, such as unstructured finite element meshes, a graph or distributed graph topology is more appropriate. In such cases, partitioning tools can be used to split the mesh into subdomains and to minimize the number of edges between partitions. The resulting communication graph between subdomains can be used directly as the process topology.
You should also consider algorithmic phases. Some phases may use one topology, for example a 3D Cartesian grid, while other phases might work more efficiently with subgrids or even with a flat communicator. Cartesian subcommunicators, as mentioned earlier, are particularly useful here.
Finally, portability should be considered. The same topology definition should be usable across systems with different numbers of nodes and different interconnects. Topologies defined in terms of communicators and dimensions, rather than hard coded rank assumptions, make it easier to adapt the program to different scales.
Summary
Process topologies provide a structured way to describe how processes relate to one another in distributed memory programs. They capture the logical geometry or graph structure of an application and connect it to MPI communicators.
Cartesian topologies support regular multidimensional grids with options for periodicity and easy mapping between ranks and coordinates. Graph and distributed graph topologies handle more general, irregular neighbor relationships that arise in unstructured meshes and complex communication patterns. Once a topology is attached to a communicator, neighbor queries and neighborhood collective operations become available, which simplifies code and can enable performance optimizations.
The choice and design of a process topology should reflect the underlying data decomposition and the communication requirements of the algorithm. Used thoughtfully, topologies are a powerful tool for writing clear, scalable distributed memory codes.