Overview of Intel oneAPI in HPC
Intel oneAPI is Intel’s modern, cross-architecture toolchain for high-performance applications. In HPC, it is primarily used for:
- Compiling and optimizing C, C++, Fortran, and SYCL codes
- Targeting CPUs (including AVX-512), GPUs, and other Intel accelerators
- Leveraging performance libraries tuned for Intel hardware
Unlike older “Intel Compiler” branding, oneAPI is a broader ecosystem that includes compilers, libraries, analysis tools, and deployment utilities.
This chapter focuses on what you need to know as an HPC beginner to use Intel oneAPI compilers effectively on typical clusters.
Key Components Relevant to HPC Users
For this course, you mainly interact with:
- Intel oneAPI DPC++/C++ Compiler (icx, icpx) — C/C++ and SYCL (for CPU and GPU)
- Intel Fortran Compiler (ifx) — modern Fortran for CPU and offload
- Intel Classic Compilers (still widely used on clusters):
  - icc / icpc (classic C/C++) — now in maintenance mode
  - ifort (classic Fortran) — also in maintenance mode
And several optimized libraries (used either directly or via other software):
- oneMKL (math kernel library)
- oneDNN (deep learning)
- oneDAL (data analytics)
- oneCCL (collective communications, often used under the hood)
On many clusters, you won’t install these yourself; you’ll load an environment module that provides them.
Using Intel Compilers on an HPC Cluster
Loading Intel oneAPI with Modules
Typical workflow (details vary per system):
module avail # see available modules
module load intel-oneapi # or similar: intel/2023.2, intel-compilers, etc.

Common variations:
- Some systems use module load intel for classic compilers only.
- Others have separate modules, e.g.:

module load intel-oneapi-compilers
module load intel-oneapi-mkl
Once loaded, commands like icx, icpx, ifx, or ifort become available in your PATH.
Basic Compiler Invocation
C and C++
Compile a single C source file:
icx mycode.c -o myprog

C++:

icpx mycode.cpp -o myprog

On systems still using classic compilers:

icc mycode.c -o myprog
icpc mycode.cpp -o myprog

Fortran
Modern oneAPI Fortran:
ifx mycode.f90 -o myprog

Classic Fortran (still very common in legacy codes):

ifort mycode.f90 -o myprog

MPI and Intel Compilers
Clusters often provide MPI “compiler wrappers” that are configured to use Intel compilers:
- mpiicc – MPI with the Intel C compiler
- mpiicpc – MPI with the Intel C++ compiler
- mpiifort or mpiifx – MPI with the Intel Fortran compiler
Example:
mpiifort parallel.f90 -o parallel_mpi

These wrappers automatically add the right MPI include paths and libraries.
Optimization and Architecture Flags (Intro Level)
You will see Intel compiler flags in other HPC examples and build systems. At a beginner level, you should recognize these:
General Optimization Levels
- -O0 — no optimization (useful for debugging)
- -O2 — default optimization, safe and common
- -O3 — more aggressive optimizations (loop transformations, vectorization)
- -Ofast — enables -O3 plus optimizations that may break strict standards compliance (use with care)
Example:
icx -O3 code.c -o code_opt

Targeting CPU Features
Intel compilers can generate code tuned for specific microarchitectures or instruction sets:
- -xHost — generate instructions for the host CPU (may not be portable)
- -march=native (for icx/icpx, Clang-like) — similar intent to -xHost
- Legacy -xCORE-AVX2, -xCORE-AVX512, etc. on classic compilers
On shared clusters, using extremely specific CPU features may cause problems if nodes differ. Often, system-wide build defaults are chosen by admins; as a new user, you generally:
- Use provided modules and builds, or
- Stick to conservative flags unless advised otherwise
Basic Vectorization Options
Vectorization is often enabled automatically at -O2 or higher. Some related flags you might encounter:
- -qopt-report=5 (classic) or similar options — detailed optimization reports
- -qopenmp (classic) / -fopenmp (newer) — enable OpenMP (covered in the OpenMP chapter)
At this stage, just recognize that Intel compilers often add their own -q... or -f... options.
Debugging vs Optimized Builds with Intel oneAPI
Matching the “Debug builds” and “Optimized builds” concepts from the parent section, Intel compilers support:
Debug-Oriented Flags
- -g — include debug information (for use with debuggers)
- -O0 — avoid optimizations that rearrange code (easier debugging)
- -traceback — better runtime error messages from Intel Fortran (supported by both ifort and ifx)
Example debug build:
ifx -g -O0 -traceback mycode.f90 -o mycode_debug

Release/Optimized Builds
- -O2 or -O3 as discussed above
- Possibly combined with architecture-specific flags, set by the build system
Example:
icpx -O3 -march=native main.cpp -o main_fast

In practice, you usually do not tune every flag manually; instead, you:
- Use build systems (Make/CMake) that have presets for Intel
- Switch between debug and release profiles via the build system
Libraries: Intel oneAPI and Math/Linear Algebra
Many HPC applications and libraries link against Intel’s math libraries. As a beginner, you should:
- Recognize names like MKL (now part of oneMKL)
- Understand that these are highly optimized for Intel CPUs
- Know that linking is often handled for you by:
  - Build systems (cmake/make)
  - Special wrappers (e.g., mklvars.sh), or
  - Preconfigured modules (e.g., module load intel-oneapi-mkl)
Manual linking can be complex; a minimal C example might look like:
icx mycode.c -o myprog -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl

On most clusters, you will instead follow local documentation or example job scripts that already know how to link MKL correctly.
Basic Use in CMake and Makefiles (Conceptual)
Without going deep into build systems (covered later), note that:
- Intel compilers can be selected in CMake via environment variables:
export CC=icx
export CXX=icpx
export FC=ifx
cmake ..

- In a simple Makefile, you might see:
CC = icx
FC = ifx
CFLAGS = -O3
FFLAGS = -O3
As a user, you often only need to adjust which compiler is used (e.g., switch from gcc to icx) by setting these variables or by loading the appropriate module before configuring the project.
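Putting this together, a minimal CMakeLists.txt that honors those CC/CXX/FC variables might look like the sketch below (the project and file names are hypothetical):

```cmake
# Minimal project file; CMake picks up CC/CXX/FC from the environment
# at first configure, so exporting icx/icpx/ifx beforehand selects Intel.
cmake_minimum_required(VERSION 3.20)
project(myapp LANGUAGES C)
add_executable(myprog mycode.c)
```

With CMake, switching between debug and release profiles is then typically done with -DCMAKE_BUILD_TYPE=Debug or -DCMAKE_BUILD_TYPE=Release at configure time, rather than by editing flags by hand.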
When (and Why) to Use Intel oneAPI in HPC
As an absolute beginner, you don’t have to decide compilers for every project, but it helps to know typical reasons people choose Intel oneAPI on clusters:
- Performance on Intel CPUs: tuned math libraries, good vectorization, and optimization
- Compatibility with large legacy codes written for Intel compilers (ifort, icc)
- Cross-architecture development: DPC++/SYCL support for Intel GPUs and accelerators
- Integration: Many vendor-provided and community scientific codes are tested and supported with Intel toolchains
For new projects, clusters may recommend:
- GCC/Clang for portability and open-source toolchains
- Intel oneAPI where you need optimized math libraries or vendor-tuned performance
As you progress through the course, you will encounter Intel oneAPI again when:
- Compiling parallel codes (MPI/OpenMP)
- Using numerical libraries (BLAS/LAPACK/ScaLAPACK via MKL)
- Performing performance analysis on Intel-based systems