Power as a First-Class Concern in HPC
High performance computing systems consume significant amounts of energy. Modern top-tier supercomputers draw several megawatts of power continuously, comparable to the demand of a small town. Understanding where this energy goes, how it is measured, and what controls it gives you the foundation to use HPC resources responsibly and to design more energy-aware applications.
In this chapter the focus is not on general ethics or green computing concepts, which are covered at the parent level, but specifically on the technical aspects of energy consumption in HPC systems and what that implies for users and developers.
Power, Energy, and Efficiency
Power and energy are related but distinct physical quantities. Power is the rate at which energy is used. Energy is the total amount consumed over time. For HPC, you can think of power as “how big the electricity hose is” and energy as “how much water came through it during a run.”
Formally, power $P$ is measured in watts (W) and energy $E$ in joules (J), with the relationship
$$
E = P \times t
$$
where $t$ is time in seconds.
In practice, data centers talk about energy in kilowatt hours (kWh), which relate to joules by
$$
1\ \text{kWh} = 3.6 \times 10^6\ \text{J}.
$$
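As a quick worked example with illustrative numbers, a job that occupies 50 nodes drawing 500 W each for two hours consumes
$$
E = 50 \times 500\ \text{W} \times 7200\ \text{s} = 1.8 \times 10^8\ \text{J} = 50\ \text{kWh}.
$$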
For HPC applications, two forms of efficiency are particularly important. Performance per watt is defined as
$$
\text{efficiency} = \frac{\text{useful work}}{\text{power}},
$$
where “useful work” is expressed as a rate, for example floating point operations per second (FLOP/s), time steps per second, or some problem-specific throughput, so that the ratio has units such as FLOP/s per watt. A closely related quantity is energy to solution, defined as
$$
E_{\text{solution}} = P_{\text{avg}} \times T_{\text{run}},
$$
where $P_{\text{avg}}$ is the average power draw of the resources used and $T_{\text{run}}$ is the runtime of the job.
The key quantity for sustainable HPC is usually energy to solution, not peak performance alone. A faster job that draws much more power can still be worse overall if its total energy use is higher.
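To make this concrete, here is a minimal sketch, with invented numbers, comparing the energy to solution of a fast, power-hungry configuration against a slower, lower-power one:

```python
# Energy-to-solution comparison; all numbers are illustrative assumptions.

def energy_to_solution(avg_power_w: float, runtime_s: float) -> float:
    """E_solution = P_avg * T_run, in joules."""
    return avg_power_w * runtime_s

# Configuration A: faster, but at high average power (e.g., maximum clocks).
e_fast = energy_to_solution(avg_power_w=400.0, runtime_s=3600.0)

# Configuration B: 15% longer runtime, but 30% lower average power.
e_slow = energy_to_solution(avg_power_w=280.0, runtime_s=4140.0)

print(f"fast: {e_fast / 1e6:.2f} MJ, slow: {e_slow / 1e6:.2f} MJ")
# fast: 1.44 MJ, slow: 1.16 MJ -- the slower run uses less total energy.
```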
Where the Energy Goes in an HPC System
An HPC facility is not just CPUs and GPUs. Energy is consumed all along the path from the power grid to the computation itself. For a user, it is helpful to recognize the main components contributing to the power bill.
Inside each node, the processors, accelerators, and memory consume a large fraction of the power budget. CPUs and GPUs draw dynamic power as their transistors switch to perform operations, and they also draw static or leakage power simply by being powered on. Memory consumes energy both when accessed and just to retain data.
Beyond the node, the interconnect that links nodes, such as InfiniBand or high performance Ethernet, consumes energy for moving data. Parallel file systems and storage arrays draw power to spin disks or keep solid state drives active and to run storage controllers. Networking equipment, routers, and switches within the cluster fabric and between cluster and storage are additional consumers.
The facility itself has overhead in the form of cooling systems and power delivery infrastructure. Chillers, pumps, fans, and air handlers remove heat from the computing equipment, while uninterruptible power supplies and transformers condition and distribute electricity. These components can consume substantial energy that is not directly doing computation but is necessary to keep the systems operating safely.
This leads to a distinction between IT power, which is the power used by the computing and networking equipment, and facility power, which includes IT power plus cooling, power conversion, lighting, and other overheads. Energy efficiency improvements can target both sides.
Power Usage Effectiveness and System-Level Metrics
Data centers use simple metrics to describe how efficiently they turn incoming electricity into useful computing work. One widely used metric is Power Usage Effectiveness, abbreviated PUE. It is defined as
$$
\text{PUE} = \frac{\text{total facility power}}{\text{IT equipment power}}.
$$
In this context, total facility power means all power measured at the building boundary, while IT equipment power is the power drawn by servers, storage, and networking devices. An ideal value of PUE would be $1.0$, which would mean that every watt consumed by the facility goes directly into the IT equipment with no overhead. In reality, there is always some overhead, typically from cooling and power distribution.
A lower PUE is better, because it means less overhead per watt of IT power. Facilities strive to keep PUE as close to 1 as possible, but values of 1.2 or 1.3 and above are still common.
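A PUE calculation is simple enough to sketch directly; the figures below are made up for illustration:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power over IT equipment power."""
    return total_facility_kw / it_equipment_kw

# A facility drawing 2400 kW at the building boundary for 2000 kW of IT load:
print(pue(2400.0, 2000.0))  # 1.2, i.e., 400 kW of cooling and conversion overhead
```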
For supercomputers, a common measure that combines performance with power is the energy efficiency reported by the Green500 list. This list ranks systems by the number of floating point operations per second achieved per watt of power consumed, often expressed as GFLOP/s per watt. This metric highlights machines that deliver high performance while keeping power use under control.
From a programmer’s perspective, it is also useful to think about application level metrics. These include time to solution at fixed power, energy per iteration of a simulation, or joules per processed data element in data analytics. There is a growing trend to include energy efficiency considerations alongside runtime in performance assessments of HPC codes.
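As a sketch of such application-level metrics, the fragment below derives energy per iteration and per data element from a job's totals; the job and all numbers are hypothetical:

```python
# Deriving application-level energy metrics from a hypothetical job's totals.

total_energy_j = 5.4e8   # e.g., reported by the scheduler or node-level meters
iterations = 20_000      # simulation time steps completed
elements = 1.2e9         # data elements processed

print(f"energy per iteration: {total_energy_j / iterations:.1f} J")   # 27000.0 J
print(f"energy per element:   {total_energy_j / elements:.3f} J")     # 0.450 J
```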
Trends in Power Consumption of Supercomputers
Historically, supercomputers have pushed toward ever higher performance, and power has risen in parallel. Without efficiency improvements, a naïve extrapolation of past trends would have led to exascale systems requiring hundreds of megawatts, clearly impractical. Instead, architectural and technological changes have reduced the energy required per operation.
Modern processors use smaller process technologies, advanced power gating, and dynamic voltage and frequency scaling (DVFS) to reduce energy usage. Accelerators such as GPUs provide many more floating point operations per unit of power than traditional CPUs for workloads that match their execution model.
Even with these advances, power remains a strict design constraint. Many large HPC systems have a power budget that cannot be exceeded, often a few megawatts, rising to a few tens of megawatts for the largest machines. This leads to trade-offs between the number of nodes, their individual performance, and operating points such as clock frequency. It also creates a context where energy efficiency improvements in software can have large aggregate effects.
For users, this means that energy-aware computing is not just a matter for hardware designers and facility engineers. Application choices influence how efficiently the system operates within its fixed power envelope. Poorly optimized codes can waste the limited power on idle cycles and inefficient memory accesses, while well-tuned software can help sustain higher effective performance within the same power budget.
Components of Node-Level Power Consumption
At the level of a single compute node, power consumption is typically divided among the processors (CPUs and accelerators), the memory, and the rest of the system. The CPU package may include multiple cores and possibly integrated GPUs or other units. Each contributes to the node’s total draw depending on utilization.
Dynamic power of a processor is approximately proportional to the square of the supply voltage and linearly proportional to the clock frequency. A simplified relationship often cited is
$$
P_{\text{dynamic}} \propto C V^2 f,
$$
where $C$ represents effective capacitance, $V$ the supply voltage, and $f$ the clock frequency. As frequency or voltage increase, dynamic power grows and heat output rises, which can require more energy for cooling. This formula captures why high clock speeds can be expensive in terms of power.
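A small sketch makes the scaling tangible. It assumes, as a common simplification, that raising the frequency also requires raising the voltage; the 20% frequency and 10% voltage increases are illustrative:

```python
# Relative dynamic power under the P_dynamic ~ C * V^2 * f model.

def relative_dynamic_power(f_ratio: float, v_ratio: float) -> float:
    """Dynamic power relative to a baseline, given frequency and voltage ratios."""
    return v_ratio ** 2 * f_ratio

# A 20% clock increase that needs a 10% voltage bump:
print(relative_dynamic_power(f_ratio=1.2, v_ratio=1.1))  # ~1.45x dynamic power
```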
Memory power depends on both capacity and activity. Higher memory bandwidth usage and frequent access patterns tend to increase energy. Storage attached to the node consumes additional power if used heavily, especially for spinning disks, although in many HPC nodes the main storage is in shared systems rather than local disks.
Other contributors include network interfaces, which draw more power when moving more data, and system components such as fans and voltage regulators. At large scale, even modest improvements in the efficiency of these parts can lead to sizeable reductions in overall energy use.
Measuring and Monitoring Power in HPC
Understanding energy consumption requires measurements. Modern HPC systems provide several layers of measurement and monitoring, some built into hardware, others provided by the facility.
At the hardware level, many processors expose power usage through on-chip sensors. For example, Intel CPUs provide energy counters through the RAPL (Running Average Power Limit) interface, and other CPU families offer similar counters. Many GPUs can likewise report instantaneous power draw and energy used over a period, for instance through NVIDIA’s NVML library or the nvidia-smi tool. These readings are not perfect but give a practical way to estimate the energy portion attributable to computation on a specific component.
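On Linux systems that expose RAPL through the powercap interface, a rough package-energy measurement can be taken by reading a counter before and after a code region. The sysfs path below and the read permissions are assumptions to verify on your system:

```python
# Minimal sketch: CPU package energy via the Linux powercap/RAPL interface.
# The path varies by system and may require elevated permissions.
import time

RAPL_PATH = "/sys/class/powercap/intel-rapl:0/energy_uj"  # package 0, microjoules

def read_energy_uj() -> int:
    with open(RAPL_PATH) as f:
        return int(f.read())

start = read_energy_uj()
time.sleep(1.0)  # ...or run the code region you want to measure here
end = read_energy_uj()

# A robust tool would handle counter wraparound using max_energy_range_uj.
print(f"approx. package energy: {(end - start) / 1e6:.3f} J")
```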
At the node level, some systems have power meters on the power supply or at the rack power distribution units. These measurements can be aggregated to see how a group of nodes behaves. System level monitoring frameworks may log power data over time and correlate it with jobs or users.
From a user perspective, there are two practical outcomes. In some environments, job schedulers expose power usage information, such as average node power while a job runs or total energy used by a job allocation. In other environments, users may need to call vendor-specific tools or libraries from within their applications to record power or energy counters.
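For example, on Slurm systems where the site has enabled energy accounting, per-job energy may appear in the accounting database. The sketch below assumes the ConsumedEnergy field is populated on your system and uses a hypothetical job ID:

```python
# Querying per-job energy from Slurm accounting (availability depends on
# the site enabling an acct_gather_energy plugin).
import subprocess

job_id = "123456"  # hypothetical job ID
result = subprocess.run(
    ["sacct", "-j", job_id, "--format=JobID,Elapsed,ConsumedEnergy", "-n", "-P"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # pipe-delimited rows: job ID, elapsed time, energy
```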
Energy-aware computing starts with measurement. Without at least approximate power or energy data for your runs, you cannot reliably evaluate energy efficiency improvements.
Power, Performance, and the Runtime Trade-off
Performance tuning and energy optimization are closely linked but not always perfectly aligned. A change that speeds up an application can reduce its energy to solution by shortening runtime, but it can also increase power so much that total energy rises. This creates a performance versus power trade-off.
You can visualize the space of options as a curve where different configurations produce different combinations of runtime and power. High frequency settings may yield short runtimes but high average power, while more moderate clock frequencies can reduce power significantly with only a modest increase in runtime. Some applications reach a point where increased clock speed provides diminishing returns because the code is limited by memory bandwidth or latency rather than raw compute throughput.
In such cases, operating CPUs and GPUs at maximum performance states can waste energy since the extra power does not translate into proportional performance gains. A slightly slower but significantly less power hungry configuration can lead to a lower energy to solution. Similar considerations apply to core counts per node, use of vector units, and degree of multithreading.
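The toy model below illustrates this effect for a partially memory-bound code: runtime shrinks only modestly with frequency while dynamic power grows steeply, so a moderate clock minimizes energy. Every constant is an illustrative assumption:

```python
# Toy runtime/power model across clock frequencies; all constants are invented.

F_BASE = 2.0e9       # baseline frequency (Hz)
T_BASE = 100.0       # runtime at baseline (s)
MEM_FRACTION = 0.6   # fraction of runtime insensitive to clock frequency
P_STATIC = 80.0      # static node power (W)
P_DYN_BASE = 120.0   # dynamic power at baseline (W), scaled as ~f^3 below

def runtime_s(f: float) -> float:
    return T_BASE * (MEM_FRACTION + (1 - MEM_FRACTION) * F_BASE / f)

def power_w(f: float) -> float:
    return P_STATIC + P_DYN_BASE * (f / F_BASE) ** 3

for f in (1.6e9, 2.0e9, 2.4e9, 2.8e9):
    t, p = runtime_s(f), power_w(f)
    print(f"{f / 1e9:.1f} GHz: {t:5.1f} s, {p:5.1f} W, {t * p / 1e3:5.1f} kJ")
# In this model the 1.6 GHz point costs ~10% more time than 2.0 GHz
# but roughly 20% less energy to solution.
```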
For parallel applications that scale across many nodes, achieving good parallel efficiency also matters. When processes or threads are idle due to load imbalance or poor communication patterns, hardware continues to consume power without advancing the computation. Improving load balance and communication efficiency can therefore lower energy use by reducing wasted cycles.
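A back-of-the-envelope sketch of that waste, with assumed per-rank power figures:

```python
# Energy wasted by load imbalance: ranks that finish early still draw idle
# power until the slowest rank completes. All numbers are illustrative.

busy_s = [95.0, 88.0, 100.0, 76.0]  # per-rank busy time in one phase (s)
P_BUSY, P_IDLE = 250.0, 120.0       # assumed per-rank power draws (W)

t_phase = max(busy_s)               # everyone waits for the slowest rank
useful_j = sum(t * P_BUSY for t in busy_s)
wasted_j = sum((t_phase - t) * P_IDLE for t in busy_s)
print(f"useful: {useful_j / 1e3:.1f} kJ, idle waste: {wasted_j / 1e3:.1f} kJ")
```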
Energy Consumption and Job Scheduling
At the system level, power is a shared and finite resource. Job schedulers are primarily designed to allocate cores, memory, and time, but power and energy constraints are becoming more relevant in scheduling decisions.
Some sites enforce a system-wide power cap, which is the maximum power the system can draw from the grid. Within this cap, the scheduler must decide which nodes to power up and which jobs to run concurrently. It may limit the number of jobs at peak frequency or adjust the distribution of work to avoid surpassing the power budget.
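A real scheduler is far more sophisticated, but the constraint itself can be sketched as a greedy admission rule over estimated job power; the jobs and figures here are invented:

```python
# Greedy power-capped admission: start queued jobs only while their estimated
# power fits under the system-wide cap.

POWER_CAP_KW = 3000.0
running_kw = 800.0  # power already committed to running jobs

queued = [("jobA", 900.0), ("jobB", 1400.0), ("jobC", 1200.0)]  # (name, est. kW)

for name, est_kw in queued:
    if running_kw + est_kw <= POWER_CAP_KW:
        running_kw += est_kw
        print(f"start {name} (committed {running_kw:.0f} kW)")
    else:
        print(f"hold  {name} (would exceed the {POWER_CAP_KW:.0f} kW cap)")
```

Note that jobC is admitted ahead of jobB in this sketch, one example of how a power constraint can reorder execution relative to a plain first-come, first-served policy.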
More sophisticated schedulers can incorporate power and energy information in their policies. They might prioritize energy efficient jobs when the system approaches its power limit, or they might schedule power intensive batches at times when electricity is cheaper or when cooling conditions are favorable. There is also research into schedulers that let users specify energy or power constraints for their jobs in addition to traditional resource requests.
As a user, this means that your application’s energy profile can influence how it fits into the global system. Highly energy intensive codes may face constraints on when and how they run, while codes designed with efficiency in mind may be easier to schedule under tight power budgets. In environments that report job energy usage, this data can also be connected to accounting or reporting mechanisms that highlight energy costs per project or per workflow.
Application-Level Patterns with Energy Impact
Several common application characteristics have a direct impact on energy consumption. The most obvious is resource utilization. If code keeps CPU or GPU units busy with useful computations and keeps memory accesses efficient, then most of the power contributes to progress. If code leaves units idle or frequently stalls on memory or I/O, then a significant portion of the power is effectively wasted.
Another important factor is data movement. Moving data is generally more energy intensive than doing arithmetic on data already in registers or close caches. Applications that move large volumes of data between memory and processors or across the network tend to have higher energy per useful operation. Organizing computation to reuse data locally and reduce communication can reduce both runtime and energy.
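The energy model below makes the point with rough, assumed per-operation costs; real figures depend heavily on the hardware generation:

```python
# Toy energy model contrasting arithmetic and off-chip data movement.
# The picojoule costs are assumed orders of magnitude, not measurements.

PJ_PER_FLOP = 10.0        # double-precision arithmetic
PJ_PER_BYTE_DRAM = 100.0  # traffic to and from main memory

def kernel_energy_j(flops: float, dram_bytes: float) -> float:
    return (flops * PJ_PER_FLOP + dram_bytes * PJ_PER_BYTE_DRAM) * 1e-12

# Same arithmetic, different locality: streaming all operands from DRAM
# versus reusing cached data so only a tenth of the traffic reaches memory.
flops = 1e12
print(kernel_energy_j(flops, dram_bytes=8e11))  # ~90 J, movement dominated
print(kernel_energy_j(flops, dram_bytes=8e10))  # ~18 J, mostly arithmetic
```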
Parallelization patterns can also influence energy. Very fine-grained parallelism with frequent synchronization can lead to many small idle periods where resources consume power without doing work. Coarser, well-balanced parallelism with fewer synchronization barriers typically uses energy more effectively.
Even simple decisions such as how often to write checkpoints, how frequently to log output, or how detailed diagnostic data should be can change I/O demands and thus energy usage. Balancing these operational needs against energy considerations is part of building sustainable HPC workflows.
Facility-Level Strategies and the User’s Role
HPC centers invest heavily in energy efficient infrastructure. They may employ advanced cooling techniques such as warm water cooling or outside air economizers to lower the energy cost of removing heat from equipment. They may choose hardware platforms for their performance per watt characteristics and optimize layout to minimize cooling hotspots.
Although these strategies are under the control of system operators, users influence how effectively they perform. If workloads consistently operate hardware at thermal limits because they are inefficient or misconfigured, cooling systems must work harder and consume more energy. If jobs are structured in ways that cause frequent underutilization and idle hardware, then the fixed overhead of cooling and power distribution represents a larger share of total energy.
Users can also participate in site policies that aim at energy awareness. For example, some centers provide guidance on preferred job sizes, recommend running highly parallel jobs during certain time windows, or encourage configurations that are known to be energy efficient for specific architectures.
Operating an HPC system sustainably is a shared responsibility between facility, system software, and users. Code design and job configuration choices have real consequences for the total energy footprint.
From Awareness to Practice
For beginners, the first step toward energy-aware HPC use is to integrate energy thinking into normal performance considerations. When measuring performance, complement runtime and scalability measurements with available power or energy data. When optimizing for speed, check whether changes also improve or degrade energy to solution.
Whenever your environment exposes energy or power metrics at the job level, pay attention to them. Compare different node counts or thread configurations not only in time to solution but also in total energy. Over time, this builds an intuition for which patterns tend to be energy efficient on the systems you use.
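A comparison like that can be as simple as tabulating each configuration's time and energy side by side; the measurements below are placeholders for whatever your site's tooling reports:

```python
# Comparing scaling runs by both time to solution and energy to solution.
# (nodes, runtime in s, energy in kJ) -- placeholder measurements.
runs = [
    (4,  5200.0, 6240.0),
    (8,  2900.0, 6960.0),
    (16, 1800.0, 8640.0),
]

for nodes, t_s, e_kj in runs:
    print(f"{nodes:3d} nodes: {t_s:6.0f} s, {e_kj:6.0f} kJ, {e_kj / t_s:4.1f} kW avg")
# More nodes keep shortening the runtime, but energy to solution grows;
# the "best" point depends on whether time or energy is the binding constraint.
```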
At a higher level, coordinate with your local HPC center. Many facilities publish guidance on energy related best practices specific to their hardware and policies. Aligning your workflows with this guidance is one of the most direct actions you can take to reduce the energy impact of your computations while still achieving your scientific or engineering goals.