Role of Compilers in HPC
High-performance computing depends heavily on compilers. The same source code can run several times faster (or slower) depending on:
- Which compiler you use
- Which version you use
- Which options you pass (covered in the “Compiler optimization flags” chapter)
“Common HPC compilers” are those that are widely available on clusters, tested with scientific codes, and provide good performance on HPC hardware.
This chapter focuses on:
- What makes a compiler “HPC‑oriented”
- The main compiler families you’ll encounter
- How they are typically used on HPC systems
- How to think about choosing between them
Details of each specific compiler family (GCC, Intel oneAPI, LLVM) come in the next subchapters; here we compare them at a higher level.
What Makes a Compiler Suitable for HPC
HPC compilers are not just “C/C++/Fortran translators.” They usually provide:
- Aggressive optimization capabilities
  - Vectorization (SIMD) for CPUs and, sometimes, GPUs (see the example after this list)
  - Loop transformations for better cache and memory use
  - Interprocedural optimization (across multiple source files)
- Support for HPC languages and standards
  - C, C++, Fortran (often multiple Fortran standards)
  - Parallel programming models: OpenMP, MPI (via libraries), CUDA/HIP/OpenACC support depending on the vendor
- Good diagnostics and tooling
  - Clear warnings and errors, sometimes static analysis features
  - Integration with debuggers and profilers used on clusters
- Architecture-specific tuning
  - CPU microarchitecture flags (e.g. AVX2, AVX‑512, ARM SVE)
  - Vendor‑specific math libraries and runtime optimizations
- Stability and long‑term support
  - Many scientific codes are large and old; HPC compilers must be robust on a wide range of legacy and modern codes.
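As a small illustration of the vectorization and diagnostics points above: most compilers can report what they did (or could not do) to your loops. The flag spellings below are GCC-specific and `stencil.c` is just a placeholder file name; other compilers have their own report options.

```bash
# Ask GCC which loops it vectorized, and why it skipped the ones it did not.
# (-fopt-info-vec reports successes, -fopt-info-vec-missed reports failures.)
gcc -O3 -march=native -fopt-info-vec -fopt-info-vec-missed -c stencil.c
```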
The Main Families of HPC Compilers
Most HPC systems will offer multiple compilers side by side. The most common families are:
- GCC (GNU Compiler Collection)
- Intel oneAPI compilers
- LLVM-based compilers (e.g. Clang/Flang, AOCC, NVHPC front‑ends)
You’ll typically see them exposed via environment modules with names along the lines of `gcc/12.3.0`, `intel-oneapi/2024.0`, or `llvm/17.0`, plus vendor-specific variants.
The next subchapters cover the details of each of these. Here we contrast them from a high level.
GCC (Overview)
GCC is:
- Free and open source
- Available on essentially every Linux system
- The “baseline” compiler for many open‑source scientific projects
Reasons it is common in HPC:
- Strong support for C, C++, and Fortran
- Reasonable optimization quality across many architectures
- Timely adoption of new language standards (modern C++ and recent Fortran revisions)
GCC is often used as:
- The default or “fall‑back” compiler
- A reference to check that vendor compilers are not miscompiling code
- A good general‑purpose choice when you don’t need vendor‑specific tuning
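As a minimal sketch (file and program names are placeholders), a typical GCC build of a small numerical code looks like this; detailed flag tuning is covered in the optimization-flags chapter.

```bash
# Optimize aggressively and tune for the CPU of the build machine.
# Use -march=native only when build and compute nodes share the same CPU type.
gcc      -O3 -march=native -o miniapp miniapp.c -lm
gfortran -O3 -march=native -o solver  solver.f90
```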
Intel oneAPI Compilers (Overview)
Intel’s compilers target Intel CPUs and GPUs and are widely deployed on x86‑based clusters. They are:
- Tuned for Intel architectures (vector units, cache hierarchy, etc.)
- Usually paired with Intel math and communication libraries
Reasons they are popular in HPC:
- Often deliver strong performance for codes that vectorize well
- Deep integration with Intel performance libraries (e.g., BLAS/LAPACK variants)
- Good OpenMP support for multi‑core and offload to Intel GPUs
They are typically used when:
- You run on Intel CPUs and want maximum performance
- You rely on Intel‑specific extensions (e.g., some offload models)
- You need tight integration with Intel math or MPI libraries
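A comparable sketch with the current Intel oneAPI drivers (`icx` for C, `icpx` for C++, `ifx` for Fortran); file names are placeholders, and flag spellings can differ between releases.

```bash
# -xHost tunes for the host CPU; -qopenmp enables OpenMP in the oneAPI compilers.
icx -O3 -xHost -qopenmp -o miniapp miniapp.c
ifx -O3 -xHost -qopenmp -o solver  solver.f90
```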
LLVM‑Based Compilers (Overview)
LLVM is a modular compiler framework. Many “compilers” are actually front-ends built on LLVM, for example:
- Clang (C/C++)
- Flang and other Fortran front‑ends
- Vendor distributions: AMD AOCC, NVIDIA HPC SDK front‑ends, ARM compilers, etc.
Reasons LLVM‑based compilers appear in HPC:
- Rapid adoption of new language features
- Good diagnostics and tooling support
- Vendor‑tuned distributions for specific architectures (AMD, ARM, etc.)
You’ll often encounter LLVM in contexts like:
- Non‑Intel CPU architectures (AMD EPYC, ARM, some RISC‑V systems)
- GPU‑oriented environments (e.g., NVIDIA, AMD ROCm SDKs)
- Research and cutting‑edge compiler features
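A rough equivalent with an LLVM-based toolchain looks like this; the Fortran driver may be called `flang`, `flang-new`, or a vendor-specific name such as `armflang`, depending on the distribution.

```bash
clang -O3 -march=native -fopenmp -o miniapp miniapp.c
flang -O3 -o solver solver.f90   # OpenMP and optimization maturity vary by Fortran front-end
```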
Typical Language and Parallelism Support
Most “HPC compilers” you’ll meet share a common baseline:
- Languages
  - C and C++ (up to at least C++17/20 on modern versions)
  - Fortran (often up to Fortran 2008 or beyond, with varying completeness)
- Parallelism
  - OpenMP for shared‑memory CPU parallelism
  - Support for compiling MPI codes (MPI itself is a library; the compiler must handle its headers/modules correctly)
  - Some degree of GPU or accelerator support:
    - OpenMP target offload (vendor‑dependent)
    - OpenACC on some compilers
    - CUDA/HIP support via separate toolchains or integrated front‑ends (covered in accelerator chapters)
The level of completeness and performance for these features often differs by compiler family and version.
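One quick way to see which OpenMP specification level a compiler claims to support is to preprocess the predefined `_OPENMP` macro, which expands to a date such as `201511` (OpenMP 4.5). This is a sketch; the exact OpenMP flag spelling varies per compiler.

```bash
# Print the OpenMP version date macro reported by each compiler.
echo _OPENMP | gcc   -fopenmp -E -x c - | tail -n 1
echo _OPENMP | clang -fopenmp -E -x c - | tail -n 1
echo _OPENMP | icx   -qopenmp -E -x c - | tail -n 1
```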
How HPC Systems Present Compilers
On a typical cluster you almost never call `/usr/bin/gcc` directly for serious work. Instead you use environment modules to load specific compiler stacks, for example:

```bash
module avail
module load gcc/12.2.0
module load intel-oneapi/2024.0
module load llvm/17.0
```

Common characteristics:
- Multiple versions of each compiler family are installed:
  - Needed for reproducibility and compatibility with different software
- Compiler families are linked to specific libraries:
  - BLAS, LAPACK, MPI, and I/O libraries may differ per toolchain
- Default compiler wrappers:
  - `mpicc`, `mpicxx`, and `mpifort` are often configured to call a particular underlying compiler family, depending on which module you loaded
Understanding which compiler you are actually using is critical for:
- Performance comparisons
- Reproducing results
- Debugging compile or runtime issues
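A few shell commands are usually enough to confirm which compiler you are really getting; the module and wrapper names here are illustrative.

```bash
module list        # which compiler/MPI modules are currently loaded
which gcc          # module-provided gcc, or the system /usr/bin/gcc?
gcc --version      # exact version of that binary
mpicc --version    # many MPI wrappers forward --version to the underlying compiler
```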
Cross‑Compiler Wrappers and Toolchain Consistency
In HPC environments you often use “compiler wrappers” provided by MPI or vendor environments rather than the raw compilers:
- `mpicc`, `mpicxx`, `mpifort` (MPI wrappers)
- Vendor compiler drivers such as `icx`, `ifx`, `nvc`, `nvc++`, `flang`, etc.
Reasons wrappers exist:
- Automatically add the correct include paths and libraries
- Ensure consistency between compiler, MPI, and math libraries
- Provide an easy way to switch entire toolchains by changing modules
Be careful to:
- Avoid mixing object files or libraries compiled with very different compilers unless you know it is supported
- Keep your build environment consistent (same compiler family and major version for all parts of a given application, in most cases)
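Most MPI wrappers can also show exactly which compiler and flags they invoke, which is a quick consistency check; the option name differs between MPI implementations.

```bash
mpicc -show      # MPICH and Intel MPI: print the underlying compile command
mpicc --showme   # Open MPI: same idea, different option name
```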
Performance and Compiler Choice
On HPC systems, picking a compiler is often a performance decision as much as a correctness decision:
- Different compilers can produce code with different:
  - Vectorization quality
  - Register and cache usage
  - Inlining and loop transformation strategies
- For some applications, performance differences between compilers can be modest; for others, they can be large (sometimes 2× or more)
Common practical approach:
- Start with a widely used baseline compiler on your system (often GCC or a recommended system default).
- Build and run a small performance test.
- Rebuild the same code with another compiler family (Intel, LLVM-based vendor compiler, etc.) using similar optimization levels.
- Compare performance and stability using your typical workloads.
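A minimal sketch of that workflow, assuming hypothetical module names and a single-file test code `bench.c`; real comparisons should use your actual application and representative inputs.

```bash
# Build and time the same code with two compiler stacks.
module load gcc/12.2.0
gcc -O3 -o bench_gcc bench.c
/usr/bin/time -p ./bench_gcc

module purge
module load intel-oneapi/2024.0
icx -O3 -o bench_icx bench.c
/usr/bin/time -p ./bench_icx
```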
Understanding and tuning compiler flags is crucial for serious performance work, but those details are covered in the “Compiler optimization flags” and “Debug/Optimized builds” chapters.
Portability vs. Vendor‑Specific Compilers
In HPC you often need to balance portability and performance:
- Portable approach
  - Use GCC or a widely available LLVM variant
  - Stick to standard C/C++/Fortran and standard OpenMP/MPI
  - Easier to move your code between different clusters and architectures
- Vendor‑specific approach
  - Use Intel oneAPI on Intel systems, AOCC on AMD, vendor-tuned Clang/Flang on ARM, etc.
  - Possibly use vendor-specific extensions or pragmas for extra performance
  - May get better performance on that vendor’s hardware, but:
    - You might need adjustments when moving to other architectures
    - You may have less consistent behavior across systems
A common pattern is:
- Develop and maintain a code base that is portable and standards‑compliant.
- Use build‑system options (e.g., CMake toolchain files, Makefile variables) to select different compilers and flags per machine.
- Compare performance across the available compilers on each target system.
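In practice this often means keeping one portable source tree and selecting the compiler per machine at configure time, for example with CMake toolchain files; the file names below are hypothetical.

```bash
# On an Intel-based cluster:
cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=cmake/clusterA-intel.cmake
# On an AMD or ARM system:
cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=cmake/clusterB-gcc.cmake
cmake --build build
```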
How Build Systems Interact with HPC Compilers
Although details of Make and CMake appear in later chapters, it is important here to understand the relationship between build systems and compiler families:
- Build systems typically expose variables to select compilers:
  - `CC` for C, `CXX` for C++, and `FC` for Fortran in Make-based systems
  - CMake variables like `CMAKE_C_COMPILER`, `CMAKE_CXX_COMPILER`, `CMAKE_Fortran_COMPILER`
- On HPC systems, you usually:
  - Load the module for the compiler stack you want.
  - Configure your build system to use that compiler (or let the environment default take over).
  - Optionally, specify MPI wrappers instead of the raw compiler commands for parallel codes.

Because clusters often have multiple compiler stacks, your project might provide:
- Separate build directories or configurations, e.g. `build-gcc`, `build-intel`, `build-llvm` (see the sketch below)
- Toolchain or configuration files that encode which compiler family and flags to use on which system.
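A hedged sketch of how those variables and per-compiler build directories are typically used (compiler choices are placeholders; Make and CMake details come in later chapters):

```bash
# Make-based project: override the compiler variables on the command line.
make CC=gcc CXX=g++ FC=gfortran

# CMake-based project: one build directory per compiler family.
cmake -S . -B build-gcc   -DCMAKE_C_COMPILER=gcc -DCMAKE_Fortran_COMPILER=gfortran
cmake -S . -B build-intel -DCMAKE_C_COMPILER=icx -DCMAKE_Fortran_COMPILER=ifx
cmake --build build-gcc
cmake --build build-intel
```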
Practical Tips for Working with Multiple HPC Compilers
- Check your site documentation:
  - Most HPC centers recommend a particular compiler and module set as a starting point.
- Check the compiler version explicitly:
  - `gcc --version`, `icx --version`, `clang --version`, etc.
- Keep records (see the sketch after this list):
  - Note which compiler and version you used in your scripts and reports for reproducibility.
- Test with more than one compiler when possible:
  - Different compilers can catch different bugs or undefined behavior in your code.
- Be cautious when mixing compilers and libraries:
  - If you must link object files from different compilers, consult your system’s documentation and test thoroughly.
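For the record-keeping tip, something as simple as capturing the loaded modules and compiler version next to your build output goes a long way; this is a sketch, and the log file name is arbitrary.

```bash
# Record the environment used for this build alongside the binaries.
{
  date
  module list 2>&1              # 'module list' writes to stderr on many systems
  gcc --version | head -n 1
} > build-environment.log
```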
The following subchapters dive into the specifics of the three main compiler families you will encounter most often in HPC: GCC, Intel oneAPI, and LLVM-based compilers.