
Shared memory systems

What “Shared Memory” Means in HPC

In a shared memory system, multiple processing units (cores, sometimes multiple CPUs) can directly access the same physical memory space. Conceptually, all threads see a single, unified address space.

This is a programming model as well as a hardware organization. When we say “shared memory system” in HPC, we usually mean a machine whose hardware and operating system present a single coherent memory address space to all cores.

Typical shared-memory machines in HPC include:

- a single multicore cluster node, whose cores all share the node's RAM
- multi-socket servers that combine several CPUs into one coherent machine
- large "fat node" SMP servers offering very large amounts of shared memory

Hardware Organization of Shared Memory Systems

Symmetric Multiprocessing (SMP)

SMP systems have:

Each processor connects to the same memory subsystem, so the time to access any memory location is approximately the same. This is often called Uniform Memory Access (UMA).

SMP characteristics relevant to HPC:

Non-Uniform Memory Access (NUMA)

Modern multi-socket HPC nodes are usually NUMA systems. Memory is physically divided into regions (sometimes called “memory domains” or “nodes”), each associated with a CPU socket:

Key points:

You will often see this described as cache-coherent NUMA (ccNUMA), because the system also maintains cache coherence across sockets (next section).

Cache Coherence in Shared Memory Systems

Each core typically has its own private caches (L1, often L2) and may share higher-level caches (L3). Since data in memory can be cached on multiple cores, the hardware must ensure that:

This is handled by cache coherence protocols at the hardware level (e.g., MESI and its variants). While you usually do not program these protocols directly, they have performance implications in HPC:

On a functional level, cache coherence is what makes shared memory programming seem simple: you just read and write variables, and all cores eventually agree on their values. On a performance level, it is a key source of overhead when scaling to many cores.

Types and Scales of Shared Memory Systems

Shared memory systems exist at different scales within HPC infrastructure:

Small-Scale Shared Memory: Single Node

Most HPC cluster nodes are themselves shared memory systems:

From a programmer’s perspective:

Large-Scale Shared Memory: Big SMP Servers

Some HPC centers host large shared-memory machines, sometimes called “fat nodes” or “big iron”:

These systems are useful for:

However, these large machines also have important drawbacks:

Shared Memory in the Context of HPC Clusters

Even though large clusters are primarily distributed-memory systems (multiple nodes connected by a network), each individual node is usually a shared-memory system:

This leads to:

For this chapter, the key takeaway is that nodes themselves are shared memory systems, and their internal structure influences how you exploit them effectively.

Advantages of Shared Memory Systems for HPC

Some benefits relevant to HPC workloads:

These properties make shared-memory systems particularly attractive for:

Limitations and Scalability Considerations

Shared memory systems do not scale indefinitely. Main limitations in the HPC context:

Memory Bandwidth and Contention

All cores draw on the same memory controllers, so aggregate memory bandwidth is a shared resource. Memory-bound codes often stop scaling well before all cores are in use, because adding cores adds contention rather than bandwidth.

Coherence Overhead

Keeping all caches coherent requires traffic between cores and sockets. Frequently written shared data, and false sharing of cache lines, make this traffic grow with core count and can become a significant overhead.

NUMA Effects

Accesses to memory attached to a remote socket have higher latency and lower bandwidth than local accesses, so careless data placement degrades performance even though programs remain correct.

Practical Size Limits

Physical and architectural constraints make it difficult to build a single, coherent shared-memory system beyond a certain size:

For these reasons, HPC systems are typically clusters of shared-memory nodes, not one monolithic global shared-memory machine.

NUMA-Aware Usage Patterns (Conceptual)

Without going into specific tools or commands, there are general patterns for using NUMA-based shared memory systems effectively:

These techniques aim to:

You do not need to know the exact commands or APIs here; the main idea is that on shared memory systems, where in memory your data lives matters for performance, not just correctness.

Typical Use Cases for Shared Memory Systems in HPC

Within an HPC environment, shared-memory systems are particularly suitable for:

Shared memory is therefore a cornerstone of node-level performance in HPC clusters, even when applications ultimately run across many nodes.

Practical Implications for Beginners

When you log in and run on an HPC cluster:

Later chapters on shared-memory programming, hybrid models, and performance optimization will build on this understanding of shared memory systems and their role inside HPC clusters.

