Role of GCC in HPC
GCC (GNU Compiler Collection) is one of the most widely used compilers in HPC, primarily because it is:
- Free and open source
- Available on almost every HPC system
- A reference implementation for many language features
- A baseline for performance comparisons with vendor compilers
On clusters, you will almost always find multiple GCC versions installed side by side, typically selected via environment modules (covered elsewhere).
This chapter focuses on using GCC for C/C++/Fortran in an HPC context, not on general compilation concepts.
Languages and Frontends Relevant to HPC
GCC supports many languages; in HPC you will mainly encounter:
- gcc – C compiler
- g++ – C++ compiler
- gfortran – Fortran compiler
They share most command-line options, but some options are language-specific (e.g., Fortran standards, C++ dialects).
Typical language standard flags:
- C: -std=c11, -std=c99
- C++: -std=c++17, -std=c++20
- Fortran: -std=f2008, -std=f2018 (support evolves with GCC versions)
Example:
gcc -std=c11 -O2 main.c -o main
g++ -std=c++17 -O3 solver.cpp -o solver
gfortran -std=f2008 -O2 climate.f90 -o climate
Basic Usage Patterns for HPC
In HPC, you rarely compile a single trivial file; you often build large, performance-critical codes. Key patterns:
- Separate compilation and linking
- Building with different optimization levels for debug vs production
- Targeting specific instruction sets of the cluster CPUs
Examples:
# Compile only
gcc -O2 -c step1.c -o step1.o
gcc -O2 -c step2.c -o step2.o
# Link
gcc step1.o step2.o -o simulation
# C++ equivalent
g++ -O3 -c solver.cpp -o solver.o
g++ solver.o -o solver
Optimization Levels in GCC
Optimization flags matter greatly in HPC. The main levels:
- -O0 – No optimization; produces slow code but is best for debugging
- -O1 – Basic optimizations; faster code, still relatively quick to compile
- -O2 – “Normal” high optimization; default for many HPC builds
- -O3 – More aggressive optimizations (loop unrolling, inlining); can help or hurt depending on code
- -Ofast – Like -O3 plus unsafe math optimizations (breaks strict IEEE/standards)
- -Og – Optimized for debugging (keeps debuggability while still optimizing somewhat)
Typical HPC usage:
- Development / debugging: -O0 -g or -Og -g
- Production runs: -O2 or -O3; sometimes -Ofast if numerics allow
Examples:
# Debug build
gcc -O0 -g main.c -o main_debug
# Typical optimized build
gcc -O2 main.c -o main
# More aggressive
gcc -O3 -march=native main.c -o main_fast
CPU-Specific Tuning and Vectorization
On HPC systems, you want to exploit the specific CPU microarchitecture and vector units.
Architecture and Tuning Flags
Common GCC flags:
- -march=<arch> – Generate instructions for a specific CPU architecture
- -mtune=<arch> – Optimize scheduling for a CPU, but keep code compatible with a wider range (as long as -march is not too specific)
- -march=native – Generate instructions for the CPU you are compiling on (binaries are not portable)
Cluster-specific settings are usually documented by the center; common values on x86-64 clusters:
- -march=haswell
- -march=broadwell
- -march=skylake-avx512
- -march=znver2, -march=znver3, etc.
Example:
# Compile targeting the login node's CPU (not always the same as the compute nodes!)
gcc -O3 -march=native code.c -o code_native
# Safer: use documented architecture of compute nodes
gcc -O3 -march=skylake-avx512 -mtune=skylake-avx512 code.c -o code_skl
Controlling Vectorization
GCC auto-vectorizes loops under higher optimization levels:
- Auto-vectorization generally kicks in at -O3 or -Ofast (since GCC 12, -O2 also enables it with a conservative cost model)
- Explicit flags: -ftree-vectorize (enabled by default with -O3); -fno-tree-vectorize to disable
To inspect vectorization:
- -fopt-info-vec – Report vectorization results (by default, successful vectorizations only)
- -fopt-info-vec-optimized – Show only successful vectorizations
- -fopt-info-vec-missed – Show missed opportunities (useful for tuning)
Example:
gcc -O3 -march=skylake-avx512 -fopt-info-vec-missed \
stencil.c -o stencil
This prints a report to stderr listing the loops that were not vectorized and why.
Debugging and Diagnostic Options
GCC provides several diagnostic options that are especially helpful with parallel and numerical HPC codes.
Debug Symbols
- -g – Include debug symbols for use with debuggers (e.g., gdb, parallel debuggers)
Example:
gcc -O0 -g main.c -o main_gdb
Warnings and Static Checks
- Enable warnings: -Wall -Wextra
- Be stricter: -Wpedantic (C/C++), -pedantic, or -std=... with -Wall
- Treat warnings as errors: -Werror (often used in CI, not always during fast prototyping)
Example:
gcc -O2 -Wall -Wextra -Wpedantic main.c -o main_strict
For large HPC codes, some teams selectively disable warnings that are too noisy.
Sanitizers (Useful for Development)
Not typically used in production builds, but very valuable in development for catching bugs:
- -fsanitize=address – Detect memory errors (out-of-bounds, use-after-free)
- -fsanitize=undefined – Detect undefined behavior
- -fsanitize=thread – Detect data races (with threads)
Example (development-only):
gcc -O1 -g -fsanitize=address -fsanitize=undefined main.c -o main_asan
Sanitizers slow down execution and increase memory usage, so they are run on small test cases.
Linking with Libraries
HPC applications almost always depend on external libraries (math, MPI, numerical libraries, etc.). GCC is used as the linker driver in many cases.
Key flags (apply to gcc, g++, gfortran):
- -L<dir> – Add library search directory
- -l<name> – Link with library lib<name>.so or lib<name>.a
- Library order on the command line matters (especially for static libraries)
Example:
# Link with math library (libm)
gcc main.c -lm -o main
# Link with libraries installed in /opt/lib
gcc main.c -L/opt/lib -lmyhpc -lm -o main
For mixed-language HPC codes:
- Use the compiler that matches the “main” language to link:
  - C main program: use gcc
  - C++ main program (or a program that uses C++ libraries): use g++
  - Fortran main program: use gfortran
This ensures correct runtime support libraries are linked.
GCC and Parallel Programming Models
Detailed models (OpenMP, MPI) are covered elsewhere; here is how GCC interacts with them.
OpenMP with GCC
GCC has strong support for OpenMP.
- Enable OpenMP: -fopenmp (pass it when compiling and when linking)
- This:
  - Defines the _OPENMP macro
  - Enables OpenMP pragmas/directives
  - Links against the GCC OpenMP runtime (libgomp)
Examples:
# C OpenMP program
gcc -O2 -fopenmp omp_example.c -o omp_example
# C++ OpenMP program
g++ -O2 -fopenmp omp_example.cpp -o omp_example
# Fortran OpenMP program
gfortran -O2 -fopenmp omp_example.f90 -o omp_example
The exact OpenMP version supported depends on the GCC version. Newer GCC releases implement more of the OpenMP 4.x/5.x features.
MPI with GCC
On HPC systems, you do not usually call GCC directly for MPI codes. Instead you use MPI wrapper compilers, which are often based on GCC:
- mpicc – for C
- mpicxx or mpiCC – for C++
- mpifort or mpif90 – for Fortran
These wrappers:
- Call the underlying GCC/G++/GFortran
- Add correct include and library paths for the MPI implementation
Knowing that these wrappers ultimately invoke GCC helps when debugging build problems and when interpreting compiler diagnostics.
Example:
mpicc -O3 -march=skylake-avx512 -fopenmp hybrid.c -o hybrid
Here, mpicc will pass -O3 -march=... -fopenmp to GCC internally.
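Most wrappers can print the command line they would run instead of running it. The helper below is a hypothetical sketch (show_wrapper is not part of any MPI distribution); it assumes that MPICH-derived wrappers accept -show and Open MPI wrappers accept --showme, and it does nothing harmful when no MPI is loaded.

```shell
# Print the underlying compiler command line of an MPI wrapper.
show_wrapper() {
    wrapper=$1
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper: not found (load an MPI module first)"
        return 0
    fi
    # Try the MPICH-style option first, then the Open MPI one.
    "$wrapper" -show 2>/dev/null || "$wrapper" --showme
}

show_wrapper mpicc
```

The printed line shows the real compiler (for example gcc) together with the include and library flags the wrapper adds.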
Position-Independent Code and Shared Libraries
Many HPC libraries are built as shared objects (.so). GCC flags for this:
- -fPIC – Generate position-independent code (needed when building shared libraries)
- -shared – Produce a shared library
Example:
# Build object files with position-independent code
gcc -O2 -fPIC -c module1.c -o module1.o
gcc -O2 -fPIC -c module2.c -o module2.o
# Create shared library libmymod.so
gcc -shared module1.o module2.o -o libmymod.so
End users typically just link against these libraries; developers of HPC libraries need these flags.
GCC Versions and Module Systems on Clusters
On HPC clusters, multiple GCC versions are usually installed:
- System GCC (often older, from the OS)
- Several newer GCC versions provided via modules, e.g.:
  - gcc/9.3.0
  - gcc/10.2.0
  - gcc/12.1.0
Implications:
- Newer standards and features (C++17/20, Fortran 2008/2018) may require newer GCC
- Performance can differ significantly between versions
- Some precompiled libraries are tied to a specific GCC version (binary compatibility)
You typically:
- Load a GCC module: module load gcc/12.1.0
- Then use gcc, g++, gfortran as usual, picking up that version.
Cluster documentation usually specifies recommended compiler versions for various software stacks.
Common GCC Command Examples for HPC
A few concrete patterns you will likely see or use:
1. Debug vs Release Switch
# Debug build (with OpenMP)
gcc -O0 -g -Wall -Wextra -fopenmp solver.c -o solver_debug
# Release build (vectorization, tuned for CPU)
gcc -O3 -march=native -fopenmp solver.c -o solver_release
2. Mixed C and Fortran
# Object files
gcc -O2 -c c_part.c -o c_part.o
gfortran -O2 -c f_part.f90 -o f_part.o
# Use gfortran to link (Fortran main program)
gfortran c_part.o f_part.o -llapack -lblas -o hybrid_solver
3. Generating Vectorization Reports
gcc -O3 -march=skylake-avx512 -fopt-info-vec-optimized \
matmul.c -o matmul
# Or only missed opportunities:
gcc -O3 -fopt-info-vec-missed matmul.c -o matmul
These reports are a key tool for performance tuning in GCC-based HPC builds.
When to Prefer GCC on HPC Systems
GCC is often the default choice when:
- You need an open-source toolchain
- Portability across multiple systems is important
- You want a baseline for performance comparisons
- Vendor/compiler-specific features are not required
Some sites may also provide vendor compilers (e.g., Intel, NVIDIA HPC, Cray), which can outperform GCC for particular architectures or workloads. In practice, many teams:
- Develop and test with GCC.
- Benchmark with GCC and vendor compilers.
- Choose the best-performing toolchain for production runs.
Understanding GCC and its options is therefore essential, both as a primary tool and as a performance baseline in HPC environments.