
Parallel I/O concepts

Why Parallel I/O Matters

As applications scale to thousands of processes and huge datasets, input and output can become the dominant cost in a run. Parallel I/O is about letting multiple processes or threads read and write data concurrently while preserving correctness and achieving higher aggregate bandwidth.

At a high level, parallel I/O aims to let many processes access storage concurrently, raise aggregate bandwidth well beyond what a single process can achieve, and keep I/O time from dominating as the job scales.

This chapter focuses on the concepts you need to understand parallel I/O patterns and the APIs/building blocks you’ll see on HPC systems.

Parallel I/O vs. Serial I/O

In a serial (or “master-only”) I/O pattern, one process (typically rank 0) gathers data from all other processes and performs every read and write on their behalf.

Issues: the single I/O process becomes a bottleneck, its memory must hold the gathered global data, and the I/O phase does not get faster as you add processes.

In parallel I/O, many processes read and write concurrently, each handling the part of the data it owns, so aggregate bandwidth can grow with the number of processes and storage targets.

Parallel I/O does not necessarily mean “many files”; you can have a single shared file accessed by all processes, or one file per process (or per group of processes).

Both are forms of parallel I/O; which is better depends on the filesystem, tools, and workflow.

Layers of Parallel I/O

Conceptually, parallel I/O is usually organized in layers:

  1. Application-level I/O
    • You call high-level I/O routines (e.g., write an array, read a variable).
    • Examples: HDF5, NetCDF, ADIOS.
  2. Parallel I/O library
    • Implements collective and non-collective operations across processes.
    • Often built on top of MPI-IO.
  3. MPI-IO
    • Standard MPI extension for parallel file I/O.
    • Provides APIs like MPI_File_read_at_all, MPI_File_write_all, etc.
    • Knows about communicators, datatypes, and views.
  4. POSIX / filesystem interface
    • Basic open, read, write, lseek, close.
    • Interacts with the parallel filesystem (e.g., Lustre, GPFS).
  5. Parallel filesystem / storage hardware
    • Distributes data across multiple devices and servers.
    • Provides bandwidth and capacity.

When you design parallel I/O, you’re choosing how you use these layers: direct POSIX from multiple ranks, MPI-IO, or a higher-level library.
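
As a concrete illustration of the MPI-IO layer, here is a minimal sketch (not from any particular code) in which every rank writes its own block of a shared binary file at a rank-based offset; the file name and block size are arbitrary choices.

```c
/* Minimal MPI-IO sketch: each rank writes its own block of doubles
 * into one shared file at a rank-based offset.
 * File name and block size are arbitrary choices for illustration. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n_local = 1024;                 /* elements owned by this rank */
    double *buf = malloc(n_local * sizeof(double));
    for (int i = 0; i < n_local; i++) buf[i] = rank;   /* dummy data */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "data.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Explicit offset: rank r owns the r-th block of the file. */
    MPI_Offset offset = (MPI_Offset)rank * n_local * sizeof(double);
    MPI_File_write_at_all(fh, offset, buf, n_local, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}
```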

Models of Parallel File Access

Shared-File vs. File-Per-Process

Two common patterns are:

  1. Shared-file model
    • All processes read and write one common file that holds the global data.
    • Each process accesses its own region, so offsets must be coordinated.
    • Convenient downstream (one file to archive, analyze, or restart from), but can stress filesystem locking and metadata handling.
  2. File-per-process model
    • Each process writes its own file, typically named by rank (a sketch follows below).
    • No coordination is needed during I/O, and each write is simple and contiguous.
    • At large scale the sheer number of files burdens metadata servers and complicates restarting on a different process count.

Many production codes use hybrid patterns, for example one file per group of processes (subfiling) or a small number of shared files written collectively.
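
For contrast with the shared-file sketch above, here is a minimal file-per-process sketch; the rank-based naming convention is just one common choice, not a standard.

```c
/* File-per-process sketch: each rank writes its own file, opened on
 * MPI_COMM_SELF so no coordination with other ranks is needed.
 * The naming scheme ("out.%06d") is an illustrative convention. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n_local = 1024;
    double *buf = malloc(n_local * sizeof(double));
    for (int i = 0; i < n_local; i++) buf[i] = rank;

    char fname[64];
    snprintf(fname, sizeof(fname), "out.%06d", rank);

    MPI_File fh;
    MPI_File_open(MPI_COMM_SELF, fname,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    /* No offsets to compute: each file holds only this rank's data. */
    MPI_File_write(fh, buf, n_local, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    free(buf);
    MPI_Finalize();
    return 0;
}
```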

Independent vs. Collective I/O

Parallel I/O operations can be independent or collective, as sketched in code below.

Independent I/O

Each process issues its own reads and writes whenever it likes, with no coordination. This is flexible, but the I/O library cannot combine or reorder requests across processes.

Collective I/O

All processes of a communicator call the operation together (e.g., MPI_File_write_all). The library can then merge many small, possibly non-contiguous requests into fewer large ones (two-phase I/O) and route them through a subset of aggregator ranks.

Choosing collective I/O is mainly about performance and sometimes robustness; the semantics (what ends up in the file) can be the same.
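
The following sketch writes the same per-rank block twice, once with an independent call and once with the matching collective call, to illustrate that the file contents can be identical while the optimization opportunities differ; the file names are arbitrary.

```c
/* Independent vs. collective write of the same data.
 * Setup mirrors the shared-file example earlier; file names are arbitrary. */
#include <mpi.h>
#include <stdlib.h>

static void write_block(const char *fname, double *buf, int n_local,
                        int rank, int collective) {
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, fname,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_Offset off = (MPI_Offset)rank * n_local * sizeof(double);

    if (collective)
        /* All ranks must call this together; the library may merge requests. */
        MPI_File_write_at_all(fh, off, buf, n_local, MPI_DOUBLE,
                              MPI_STATUS_IGNORE);
    else
        /* Each rank acts alone; no cross-rank optimization is possible. */
        MPI_File_write_at(fh, off, buf, n_local, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n_local = 1024;
    double *buf = malloc(n_local * sizeof(double));
    for (int i = 0; i < n_local; i++) buf[i] = rank;

    write_block("independent.bin", buf, n_local, rank, 0);
    write_block("collective.bin",  buf, n_local, rank, 1);

    free(buf);
    MPI_Finalize();
    return 0;
}
```

Both files end up with identical contents; only the coordination (and therefore the performance) differs.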

Data Partitioning and File Views

To do parallel I/O correctly and efficiently, each process must know which part of the global data it owns in memory and where that part lives in the file (its offset and layout).

This mapping is at the core of parallel I/O design.

Decomposition in Memory

In parallel programs, data structures are typically decomposed across processes, for example block or cyclic splits of a 1D array, 2D/3D domain decompositions of a grid, or distributed sets of particles or unstructured-mesh elements.

This partitioning is usually already decided in the computation phase. Parallel I/O needs a file layout that reflects or at least is compatible with that decomposition.
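
As a small, self-contained example of how a decomposition fixed in the compute phase translates into I/O parameters, this sketch derives each rank's element count and starting offset for a simple 1D block split; the global size is an arbitrary example.

```c
/* Sketch: deriving each rank's share and file offset for a 1D block
 * decomposition of n_global elements (n_global is an arbitrary example). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long n_global = 1000000;            /* total elements in the file */
    long base = n_global / size;
    long rem  = n_global % size;

    /* The first `rem` ranks get one extra element each. */
    long n_local = base + (rank < rem ? 1 : 0);
    long start   = rank * base + (rank < rem ? rank : rem);

    printf("rank %d owns elements [%ld, %ld)\n", rank, start, start + n_local);

    MPI_Finalize();
    return 0;
}
```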

Logical File Layout

The file typically stores a global view of the data (e.g., a full 3D grid). Processes should be able to locate their own sub-domain within that global layout and read or write it without disturbing data owned by other processes.

A file view (in MPI-IO terminology) describes which portion of the file a process “sees” and in what order, using a displacement, an elementary datatype, and a filetype built from MPI derived datatypes.

Conceptually, a file view lets you say: “For rank r, this is the slice of the file representing my sub-domain,” even if that slice is strided or irregular.
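
Here is a hedged sketch of that idea using MPI_Type_create_subarray and MPI_File_set_view: each rank writes its column block of a global 2D array into one shared file. The grid dimensions and the assumption that they divide evenly by the number of ranks are purely illustrative.

```c
/* Sketch: a file view built from MPI_Type_create_subarray so that each
 * rank "sees" only its strided column block of a global NX x NY array
 * stored row-major in one shared file. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Global grid split into column blocks (assumes NY % size == 0),
       so each rank's file region is strided rather than contiguous. */
    const int NX = 64, NY = 64;
    int gsizes[2] = { NX, NY };
    int lsizes[2] = { NX, NY / size };
    int starts[2] = { 0, rank * (NY / size) };

    MPI_Datatype filetype;
    MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                             MPI_ORDER_C, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    int n_local = lsizes[0] * lsizes[1];
    double *local = malloc((size_t)n_local * sizeof(double));
    for (int i = 0; i < n_local; i++) local[i] = rank;   /* dummy data */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "grid.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* The view: from displacement 0, this rank sees only its sub-block. */
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);
    MPI_File_write_all(fh, local, n_local, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    free(local);
    MPI_Finalize();
    return 0;
}
```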

Access Patterns and Their Impact

The pattern with which processes access the file strongly influences performance.

Contiguous vs. Non-Contiguous Access

Contiguous access means each process touches one large, sequential region of the file; non-contiguous access scatters many small pieces across the file and usually costs far more per byte. When possible, design file layouts where each process can perform few, large, contiguous operations.
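
The sketch below writes the same per-rank data twice, once as many small requests and once as a single large one, to make the difference concrete; the chunk sizes are arbitrary.

```c
/* Sketch contrasting many small writes with one large contiguous write
 * of the same per-rank region of a shared file. Sizes are arbitrary;
 * the point is the request count, not the exact numbers. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n_chunks = 256, chunk = 64;       /* 256 writes of 64 doubles */
    const int n_local  = n_chunks * chunk;
    double *buf = malloc(n_local * sizeof(double));
    for (int i = 0; i < n_local; i++) buf[i] = rank;

    MPI_Offset base = (MPI_Offset)rank * n_local * sizeof(double);
    MPI_File fh;

    /* Many small requests: slow on most systems. */
    MPI_File_open(MPI_COMM_WORLD, "small_writes.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    for (int c = 0; c < n_chunks; c++)
        MPI_File_write_at(fh, base + (MPI_Offset)c * chunk * sizeof(double),
                          buf + c * chunk, chunk, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    /* One large request with the same end result. */
    MPI_File_open(MPI_COMM_WORLD, "one_write.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at_all(fh, base, buf, n_local, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    free(buf);
    MPI_Finalize();
    return 0;
}
```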

Regular vs. Irregular Patterns

If your computation uses irregular decomposition, consider aggregating or reordering data into a more regular layout before writing, using collective I/O so the library can combine scattered requests, or relying on a high-level library that supports irregular selections directly.

Shared-File Coordination and Consistency

With multiple processes writing to one file, you must avoid overlapping writes to the same bytes, gaps or corruption from miscomputed offsets, and races in which one process reads data another process has not yet made visible.

Concepts relevant to coordination are file offsets, atomicity, and consistency/visibility, discussed below.

File Offsets and Atomicity

Each process can use its own individual file pointer, supply explicit offsets (e.g., MPI_File_write_at), or use a shared file pointer that all processes advance in a coordinated way.

Atomicity refers to the guarantee that a single write appears in the file either entirely or not at all and is never interleaved byte-by-byte with a concurrent write from another process. MPI-IO lets you switch this guarantee on or off; the relaxed (non-atomic) default is usually faster.
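
A minimal sketch of toggling MPI-IO atomic mode on a shared file (the file name and offsets are arbitrary):

```c
/* Sketch: enabling and disabling MPI-IO atomic mode on a shared file.
 * Atomic mode guarantees non-interleaved writes but is usually slower. */
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared.bin",
                  MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);

    MPI_File_set_atomicity(fh, 1);   /* strict, non-interleaved writes */

    double value = rank;
    MPI_File_write_at(fh, (MPI_Offset)rank * sizeof(double),
                      &value, 1, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_set_atomicity(fh, 0);   /* back to the faster default */

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```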

Consistency and Visibility

Because of caching and buffering at the client, library, and filesystem levels, data written by one process is not necessarily visible to other processes immediately, even after the write call has returned.

The typical pattern for making one process's writes visible to readers elsewhere is: write, synchronize the file (e.g., MPI_File_sync), synchronize the processes (e.g., MPI_Barrier), synchronize the file again, and only then read.
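
A sketch of that sync/barrier/sync pattern in MPI-IO terms, with an arbitrary file name and a single value written by rank 0:

```c
/* Sketch of the MPI-IO "sync, barrier, sync" pattern: rank 0 writes a
 * value and every rank then reads it back without relying on atomic mode. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "flag.bin",
                  MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);

    double value = 42.0, readback = 0.0;
    if (rank == 0)
        MPI_File_write_at(fh, 0, &value, 1, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_sync(fh);               /* flush the writer's data */
    MPI_Barrier(MPI_COMM_WORLD);     /* order writes before reads */
    MPI_File_sync(fh);               /* make the data visible to readers */

    MPI_File_read_at(fh, 0, &readback, 1, MPI_DOUBLE, MPI_STATUS_IGNORE);
    printf("rank %d read %.1f\n", rank, readback);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```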

The Role of MPI-IO (Conceptually)

Without going into full API details, MPI-IO introduces several key ideas: a file is opened collectively by a communicator rather than by a single process; data layout is described with MPI datatypes and file views; operations come in independent and collective variants; and hints can be passed to tune behavior for a particular filesystem.

From a conceptual standpoint, MPI-IO provides the primitives to express “which process reads or writes which part of which file” portably, and to let the implementation optimize how that maps onto the underlying storage.

High-Level Parallel File Formats

Many scientific codes do not use MPI-IO directly but rely on parallel-aware libraries and formats. At a conceptual level, these provide self-describing files: named, typed, multi-dimensional variables with attributes and metadata, portable across machines and independent of how many processes wrote them.

These libraries typically sit on top of MPI-IO, translate “write this sub-array of this variable” into file views and collective operations, and take care of details such as data conversion, chunking, and metadata handling.

Understanding the concepts of decomposition, file views, and collective I/O helps you use such libraries more effectively.
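
As an illustration of how such a library hides the lower layers, here is a hedged parallel HDF5 sketch of the same “each rank writes its block” idea; it assumes an HDF5 build with MPI support, and the dataset name, sizes, and file name are made up.

```c
/* Sketch: parallel HDF5 write of one global 1D dataset, each rank
 * selecting its own hyperslab (requires HDF5 built with MPI support). */
#include <mpi.h>
#include <hdf5.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const hsize_t n_local = 1024;
    double *buf = malloc(n_local * sizeof(double));
    for (hsize_t i = 0; i < n_local; i++) buf[i] = rank;

    /* Open the file collectively through the MPI-IO driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("out.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One global dataset; each rank selects its own region of it. */
    hsize_t gdim = (hsize_t)size * n_local;
    hid_t filespace = H5Screate_simple(1, &gdim, NULL);
    hid_t dset = H5Dcreate2(file, "field", H5T_NATIVE_DOUBLE, filespace,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    hsize_t start = (hsize_t)rank * n_local;
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &start, NULL,
                        &n_local, NULL);
    hid_t memspace = H5Screate_simple(1, &n_local, NULL);

    /* Ask for a collective transfer so HDF5/MPI-IO can optimize. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

    H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
    H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
    free(buf);
    MPI_Finalize();
    return 0;
}
```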

Typical Parallel I/O Patterns

A few common usage patterns in HPC codes:

  1. Checkpoint/Restart I/O
    • All processes periodically write their state to disk.
    • Usually a parallel write to one or a small number of files.
    • Emphasis on robustness and performance.
  2. Time-stepped output
    • At each time step, processes write out fields or snapshots.
    • May use one file per time step, or a multi-time-step file.
  3. Parallel analysis / post-processing
    • Separate parallel job reads simulation outputs in parallel.
    • Can use a different decomposition than the original code.

In each case, your choices about file organization (shared file vs. file-per-process), access pattern, and independent vs. collective operations largely determine how well the I/O phase scales.
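
A checkpoint-style sketch combining these choices, with an illustrative step-numbered file naming scheme and one collective shared-file write per step:

```c
/* Checkpoint sketch: at each step, all ranks collectively write their
 * state into one step-numbered file. Naming scheme and state size are
 * illustrative assumptions. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

static void write_checkpoint(int step, double *state, int n_local, int rank) {
    char fname[64];
    snprintf(fname, sizeof(fname), "ckpt_%05d.bin", step);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, fname,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_Offset off = (MPI_Offset)rank * n_local * sizeof(double);
    MPI_File_write_at_all(fh, off, state, n_local, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n_local = 1024;
    double *state = malloc(n_local * sizeof(double));
    for (int i = 0; i < n_local; i++) state[i] = rank;

    for (int step = 0; step < 3; step++) {
        /* ... computation would update `state` here ... */
        write_checkpoint(step, state, n_local, rank);
    }

    free(state);
    MPI_Finalize();
    return 0;
}
```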

Performance Considerations (Conceptual)

While detailed tuning belongs elsewhere, conceptually: fewer, larger, well-aligned requests beat many small ones; collective operations give the library room to optimize; and the file layout should match the filesystem's striping so that many storage targets stay busy.

Parallel I/O design is about balancing raw bandwidth, metadata load (number of files, opens, and creates), implementation complexity, and how convenient the resulting files are for later analysis and restarts.
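
One conceptual knob worth knowing about is the MPI_Info hint mechanism. The sketch below passes a few commonly seen ROMIO/Lustre-style hints; hint names and values are implementation-dependent, and unrecognized hints are simply ignored, so this is safe to experiment with.

```c
/* Sketch: passing tuning hints when opening a shared file. The hint
 * names shown here ("romio_cb_write", "striping_factor", "striping_unit")
 * are common ROMIO/Lustre hints but are not guaranteed on every system. */
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_cb_write", "enable");   /* collective buffering */
    MPI_Info_set(info, "striping_factor", "16");      /* e.g. Lustre stripe count */
    MPI_Info_set(info, "striping_unit", "4194304");   /* 4 MiB stripes */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "tuned.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* ... writes as in the earlier examples ... */

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```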

Summary of Core Concepts

Key ideas to keep in mind for parallel I/O: serial vs. parallel patterns, shared-file vs. file-per-process organization, independent vs. collective operations, data decomposition and file views, the cost of access patterns, and the coordination, atomicity, and consistency rules for shared files.

These concepts form the foundation for using specific parallel filesystems, libraries, and optimizations in practice.
