Kahibaro

CPUs, cores, and clock speeds

Role of CPUs in HPC

In HPC systems, the CPU (Central Processing Unit) is the primary general-purpose compute engine. While GPUs and other accelerators are increasingly important, nearly every HPC workflow still depends on CPUs for:

For HPC programmers and users, understanding CPUs means understanding three tightly related ideas:

These characteristics directly affect how many tasks your job can run in parallel and how fast each task can execute.

CPU Sockets vs Cores

CPU sockets

A socket (or CPU package) is the physical chip plugged into the motherboard. An HPC node may have:

When you see a description such as “2 × 32-core CPUs,” the node has two sockets with 32 cores each, for 64 physical cores in total.

Schedulers and resource managers typically expose both nodes and cores as resources. The number of sockets matters for memory locality and interconnect topology (which are covered elsewhere), but from the perspective of this chapter, sockets are just containers for cores.

CPU cores

A core is an independent execution unit inside a CPU that can run its own instruction stream. Conceptually:

Important properties of cores:

Modern HPC CPUs may have tens to hundreds of cores per socket. For example:

Hardware Threads and Simultaneous Multithreading (SMT)

Many CPUs support simultaneous multithreading (SMT), which Intel markets as Hyper-Threading. With SMT, each physical core presents two or more hardware threads (logical CPUs) to the operating system.

For a CPU with:

The OS will report 64 logical CPUs.
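As a rough sketch, you can check what the OS reports using Python's standard library. `os.cpu_count()` returns the number of logical CPUs on the machine, and on Linux `os.sched_getaffinity(0)` returns the set of logical CPUs the current process is allowed to use, which may be smaller inside a scheduled job. The helper names and the choice of 2 hardware threads per core are assumptions made for illustration, not properties of any particular system.

```python
import os


def logical_cpus_visible() -> int:
    """Number of logical CPUs this process may run on."""
    # sched_getaffinity is only available on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # cpu_count() may return None if the count cannot be determined.
    return os.cpu_count() or 1


def physical_cores(logical_cpus: int, threads_per_core: int) -> int:
    """Physical cores implied by a logical CPU count and an SMT width."""
    if logical_cpus < 1 or threads_per_core < 1:
        raise ValueError("counts must be positive")
    if logical_cpus % threads_per_core != 0:
        raise ValueError("logical CPUs must be a multiple of threads per core")
    return logical_cpus // threads_per_core


print("logical CPUs visible to this process:", logical_cpus_visible())
# Assumption: 2-way SMT, so 64 logical CPUs correspond to 32 physical cores.
print("physical cores for 64 logical CPUs:", physical_cores(64, 2))
```

The number printed on the first line depends on the machine and on any CPU binding applied by the scheduler, so treat it as a quick check rather than a definitive description of the node.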

Key points for HPC:

HPC schedulers may let you request:

You need to know whether “64 CPUs per node” in documentation means 64 physical cores or 64 logical CPUs (hardware threads) on a smaller number of physical cores.

What Clock Speed Means

Clock cycles and frequency

The CPU core executes instructions in discrete time steps called clock cycles. The clock speed or frequency is how many cycles happen per second.

A core with a 3.0 GHz clock runs at:

$$
3.0 \times 10^9 \text{ cycles/second}
$$

The (idealized) time per cycle is:

$$
\text{cycle time} = \frac{1}{\text{frequency}}
$$

So at 3.0 GHz:

$$
\text{cycle time} = \frac{1}{3.0 \times 10^9} \approx 0.33 \text{ ns}
$$
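The conversion between frequency and cycle time can be sketched in a few lines of Python. The function names are hypothetical, and the 100 ns wait used in the second call is an illustrative figure, not a measurement of any real system.

```python
def cycle_time_ns(frequency_ghz: float) -> float:
    """Idealized duration of one clock cycle, in nanoseconds."""
    if frequency_ghz <= 0:
        raise ValueError("frequency must be positive")
    # 1 GHz is 1e9 cycles/second, so 1 / f (with f in GHz) is already in ns.
    return 1.0 / frequency_ghz


def cycles_elapsed(duration_ns: float, frequency_ghz: float) -> float:
    """Number of clock cycles that pass during a given duration."""
    return duration_ns * frequency_ghz


print(round(cycle_time_ns(3.0), 2))   # 0.33
# Illustrative only: a 100 ns wait at 3.0 GHz spans 300 cycles.
print(cycles_elapsed(100.0, 3.0))     # 300.0
```

Expressing delays in cycles rather than nanoseconds makes it easier to see how much work a core could have done while it was waiting.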

Programs are made up of instructions; each instruction takes one or more cycles to complete when executed on the core. In practice:

Base, turbo, and all-core frequencies

Modern CPUs scale their clock frequency dynamically:

In HPC:

This means the “3.5 GHz” advertised on a spec sheet does not guarantee 3.5 GHz simultaneously on all cores for a long-running, full-node job.

Performance and frequency

Ignoring other factors, the time to complete a fixed number of instructions scales approximately inversely with frequency:

If a program takes time $T_1$ at frequency $f_1$ and time $T_2$ at frequency $f_2$, and the same instruction stream is executed with the same efficiency, then:

$$
\frac{T_1}{T_2} \approx \frac{f_2}{f_1}
$$

For example, a move from 2.5 GHz to 3.0 GHz could ideally give:

$$
\text{speedup} \approx \frac{3.0}{2.5} = 1.2
$$

i.e., about 20% more work per unit time, with runtime dropping to roughly 83% of the original. In real workloads, memory and I/O bottlenecks often reduce the actual gain.
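The ideal scaling above, and the way bottlenecks erode it, can be sketched as follows. The second function is a simplified model introduced here for illustration: it assumes that only a fraction of the runtime (`compute_fraction`, a hypothetical parameter) scales with frequency, while the rest, such as time spent waiting on memory or I/O, stays fixed.

```python
def ideal_speedup(f_old_ghz: float, f_new_ghz: float) -> float:
    """Speedup if runtime scales exactly inversely with frequency."""
    return f_new_ghz / f_old_ghz


def speedup_with_bottleneck(
    f_old_ghz: float, f_new_ghz: float, compute_fraction: float
) -> float:
    """Speedup when only `compute_fraction` of the runtime scales with frequency."""
    if not 0.0 <= compute_fraction <= 1.0:
        raise ValueError("compute_fraction must be between 0 and 1")
    scaled = compute_fraction * f_old_ghz / f_new_ghz  # part that gets faster
    unscaled = 1.0 - compute_fraction                  # part that does not
    return 1.0 / (scaled + unscaled)


print(f"{ideal_speedup(2.5, 3.0):.2f}")                 # 1.20
# Assumed: half of the runtime is insensitive to clock speed.
print(f"{speedup_with_bottleneck(2.5, 3.0, 0.5):.2f}")  # 1.09
```

Under this assumed split, the 20% frequency increase yields only about a 9% speedup, which is why frequency alone tends to overstate the gain for memory-bound codes.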

Cores, Frequency, and Throughput vs Latency

In HPC, it is useful to distinguish:

Relating this to cores and clock speed:

Examples:

Cluster procurement trade-off (often not your decision, but useful to understand):

Measuring Core and CPU Performance: A Simple Model

A very basic model for per-core floating-point performance is:

$$
\text{FLOP/s per core} \approx f \times \text{FLOP per cycle}
$$

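where $f$ is the core's clock frequency in cycles per second and FLOP per cycle is the number of floating-point operations the core can complete in each cycle.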

Total theoretical performance of a node (ignoring overheads) is roughly:

$$
\text{FLOP/s per node} \approx (\text{FLOP/s per core}) \times (\text{number of cores})
$$
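The two formulas above translate directly into a small calculation. The node used in the example (2.5 GHz, 16 FLOP per cycle, 64 cores) is a hypothetical configuration chosen for easy arithmetic, not the specification of a real CPU; the function names are likewise illustrative.

```python
def peak_gflops_per_core(frequency_ghz: float, flop_per_cycle: float) -> float:
    """Theoretical peak of one core in GFLOP/s: f x FLOP per cycle."""
    return frequency_ghz * flop_per_cycle


def peak_gflops_per_node(
    frequency_ghz: float, flop_per_cycle: float, cores: int
) -> float:
    """Theoretical peak of a node in GFLOP/s, ignoring all overheads."""
    return peak_gflops_per_core(frequency_ghz, flop_per_cycle) * cores


# Hypothetical node: 64 cores at 2.5 GHz, each doing 16 FLOP per cycle.
print(peak_gflops_per_core(2.5, 16))      # 40.0
print(peak_gflops_per_node(2.5, 16, 64))  # 2560.0
```

Because sustained all-core frequency is often lower than the advertised figure, it is usually more realistic to plug in the frequency you expect under full load rather than the turbo value.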

In practice:

Power, Thermal Limits, and Energy Considerations

Higher clock speeds and more active cores increase:

Modern CPUs manage this via:

For HPC systems:

From a job’s perspective:

Some systems expose frequency controls or power management features to users (within limits), which can be leveraged for energy-aware computing (discussed in sustainability-focused sections).

Practical Implications for HPC Users

Reading node specifications

When you see something like:

you can infer:

This helps you decide:

Choosing core counts for jobs

Common patterns:

Although scheduling and hybrid strategies are addressed in later chapters, you should already be comfortable that:

Not relying blindly on GHz

Clock speed alone is not a reliable performance predictor across different CPU generations or architectures because:

However, within the same CPU family, at similar core counts:

For performance-critical runs, it is common to:

Summary
