Kahibaro

Registers

What Registers Are in the Memory Hierarchy

Registers are the fastest storage locations in a CPU and sit at the very top of the memory hierarchy. They are built into the CPU core itself, few in number, and named directly by instructions.

In the hierarchy, you can think of both access latency and capacity as growing in this order:

$$\text{Registers} \ll \text{L1 cache} \ll \text{L2/L3 cache} \ll \text{RAM} \ll \text{disk}$$

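A register is read or written as part of the instruction that uses it, so the access is effectively “free” compared to any other memory access. Because it never goes through the caches, it cannot cause a cache miss.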

Types of Registers (Conceptual)

Different CPU architectures define different exact register sets, but conceptually you will encounter:

From a high-level HPC perspective, the distinction that really matters is:

Registers and Instruction Execution

For a typical CPU instruction, operands must be in registers:

  1. Values are loaded from memory (through caches) into registers.
  2. Instructions perform computations on register values.
  3. Results may be stored back from registers to memory.

Example (conceptual, not exact assembly):

; Load from memory to registers
LOAD R1, [A]      ; R1 = A
LOAD R2, [B]      ; R2 = B
; Compute in registers
ADD  R3, R1, R2   ; R3 = R1 + R2
; Store result to memory
STORE [C], R3     ; C = R3

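On load/store (RISC-style) architectures, all arithmetic and logical operations happen in registers, and memory is only read or written via load/store instructions. Some architectures, such as x86, also accept a memory operand in an arithmetic instruction, but the hardware still fetches that value before it can compute with it.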
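To connect the conceptual assembly above to source code, here is a minimal C sketch of the same load, compute, store sequence. The function name `add_via_registers` and the local names `r1`, `r2`, `r3` are hypothetical and chosen only to mirror the example; the actual instructions and registers a compiler picks will differ by architecture and optimization level.

```c
/* Hypothetical helper, named for illustration only.
 * Each statement mirrors one step of the conceptual assembly. */
void add_via_registers(const int *a, const int *b, int *c)
{
    int r1 = *a;       /* load:  memory -> register     (LOAD R1, [A]) */
    int r2 = *b;       /* load:  memory -> register     (LOAD R2, [B]) */
    int r3 = r1 + r2;  /* add:   register values only   (ADD R3, R1, R2) */
    *c = r3;           /* store: register -> memory     (STORE [C], R3) */
}
```

In everyday code you would simply write `*c = *a + *b;` and the compiler would produce an equivalent sequence on its own.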

Registers and Compiler Optimization

You don’t usually manipulate registers directly in high-level languages; the compiler decides what lives in registers. However, your coding style and compiler options strongly influence register usage:

When a compiler cannot keep all needed values in registers, it performs a register spill: some values are temporarily stored to memory (typically the stack) and reloaded later. Spilling is much slower than staying entirely in registers.

For HPC, a key performance idea is:
Minimize register spills in hot (performance-critical) code sections.
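Spills are the compiler's decision, but coding style can cause a similar memory round trip. The sketch below contrasts accumulating through a pointer with accumulating in a local variable. Both function names are hypothetical, and whether `acc` really stays in a register depends on the compiler and its flags, so treat this as an illustration of the idea rather than a guaranteed optimization.

```c
#include <stddef.h>

/* Accumulates through *out. Since out could point into x, the compiler
 * generally has to write the running sum back to memory on every
 * iteration instead of keeping it only in a register. */
void sum_through_pointer(const double *x, size_t n, double *out)
{
    *out = 0.0;
    for (size_t i = 0; i < n; i++) {
        *out += x[i];
    }
}

/* Accumulates in a local variable. Nothing else can refer to acc,
 * so it is a natural candidate to live in a register for the whole
 * loop; memory is written once at the end. */
void sum_in_local(const double *x, size_t n, double *out)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        acc += x[i];
    }
    *out = acc;
}
```

Both versions compute the same result; the difference is only in how much freedom the compiler has to keep the hot value out of memory.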

Register Pressure

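Register pressure describes how many values must be held in registers at the same time at a given point in the code, relative to how many registers are physically available. Pressure is high when that demand approaches or exceeds the supply.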
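As a rough illustration of where this demand comes from, consider a dot product unrolled by four. The function name `dot_unrolled4` is hypothetical. Each accumulator is a value that must stay live across the whole loop, in addition to the loop index, the two pointers, and the temporaries for each product.

```c
#include <stddef.h>

/* Dot product with four independent accumulators. Unrolling further
 * would add more live values, and past some point the compiler would
 * run out of registers to hold them all. */
double dot_unrolled4(const double *a, const double *b, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; i++) {
        s0 += a[i] * b[i];
    }

    return (s0 + s1) + (s2 + s3);
}
```

Four accumulators fit comfortably on typical CPUs; the point is that every extra unroll step or extra temporary adds to the number of values competing for registers.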

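High register pressure leads to register spills: values are written to the stack and reloaded later, which adds load and store instructions to the code.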

Factors that increase register pressure:

Typical ways to help the compiler reduce register pressure in HPC code:

Registers and Vectorization

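Modern HPC CPUs have vector/SIMD registers that can hold multiple data elements. For example, a 256-bit vector register holds eight 32-bit floats or four 64-bit doubles.

Vector instructions operate on the entire register at once, applying the same operation to every element it holds.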

For a simple loop like:

for (int i = 0; i < N; i++) {
    C[i] = A[i] + B[i];
}

The compiler can:

  1. Load multiple A[i] and B[i] values into vector registers.
  2. Perform vector additions using SIMD instructions.
  3. Store the resulting vector register back to memory.
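The loop above can be written so that the compiler has an easier time applying these three steps. The sketch below is one way to do that, assuming C99 or later; the function name `vec_add` is hypothetical. The `restrict` qualifier is a promise from the programmer that the three arrays do not overlap, which removes one common obstacle to auto-vectorization. Whether SIMD instructions are actually generated still depends on the compiler, the target CPU, and the optimization flags in use.

```c
#include <stddef.h>

/* Element-wise addition c[i] = a[i] + b[i].
 * restrict tells the compiler that a, b, and c never alias, so it is
 * free to load several elements at once, add them in a vector
 * register, and store the results together. */
void vec_add(const float *restrict a,
             const float *restrict b,
             float *restrict c,
             size_t n)
{
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

Calling `vec_add` with overlapping arrays would break the `restrict` promise and is undefined behavior, so this form only fits code where the arrays are known to be distinct.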

Effective use of vector registers is central to HPC performance. How to exploit them is covered in more depth under SIMD/vectorization concepts. The key point here is that vector registers are simply wider, specialized registers that let one instruction operate on several data elements in parallel.

Registers, Function Calls, and the Stack

Function calls influence register usage:

This convention:

In performance-critical HPC kernels:

Registers and Different Data Types

Different data types often occupy different registers or portions of a register set:

From an HPC perspective:

Practical Signals of Register Issues

Even without reading assembly, you can sometimes infer register-related problems from:

In small experiments, you might observe:

Summary: Registers in HPC Context

Key points specific to registers in the memory hierarchy:

Understanding how registers fit into the hierarchy helps you reason about why seemingly small code changes or compiler flags can dramatically impact performance in high-performance computing.
