Overview of Intel oneAPI in HPC
Intel oneAPI is Intel’s modern, cross-architecture toolchain for high-performance applications. In HPC, it is primarily used for:
- Compiling and optimizing C, C++, Fortran, and SYCL codes
- Targeting CPUs (including AVX-512), GPUs, and other Intel accelerators
- Leveraging performance libraries tuned for Intel hardware
Unlike older “Intel Compiler” branding, oneAPI is a broader ecosystem that includes compilers, libraries, analysis tools, and deployment utilities.
This chapter focuses on what you need to know as an HPC beginner to use Intel oneAPI compilers effectively on typical clusters.
Key Components Relevant to HPC Users
For this course, you mainly interact with:
- Intel oneAPI DPC++/C++ Compiler (icx, icpx) — C/C++ and SYCL (for CPU and GPU)
- Intel Fortran Compiler (ifx) — modern Fortran for CPU and offload
- Intel Classic Compilers (still widely used on clusters):
  - icc / icpc (classic C/C++) — now in maintenance mode
  - ifort (classic Fortran) — also in maintenance mode
And several optimized libraries (used either directly or via other software):
- oneMKL (math kernel library)
- oneDNN (deep learning)
- oneDAL (data analytics)
- oneCCL (collective communications, often used under the hood)
On many clusters, you won’t install these yourself; you’ll load an environment module that provides them.
Using Intel Compilers on an HPC Cluster
Loading Intel oneAPI with Modules
Typical workflow (details vary per system):
module avail # see available modules
module load intel-oneapi # or similar: intel/2023.2, intel-compilers, etc.

Common variations:
- Some systems use module load intel for classic compilers only.
- Others have separate modules, e.g.:

module load intel-oneapi-compilers
module load intel-oneapi-mkl
Once loaded, commands like icx, icpx, ifx, or ifort become available in your PATH.
Basic Compiler Invocation
C and C++
Compile a single C source file:
icx mycode.c -o myprog

C++:

icpx mycode.cpp -o myprog

On systems still using classic compilers:

icc mycode.c -o myprog
icpc mycode.cpp -o myprog

Fortran
Modern oneAPI Fortran:
ifx mycode.f90 -o myprog

Classic Fortran (still very common in legacy codes):

ifort mycode.f90 -o myprog

MPI and Intel Compilers
Clusters often provide MPI “compiler wrappers” that are configured to use Intel compilers:
- mpiicc – MPI with the Intel C compiler
- mpiicpc – MPI with the Intel C++ compiler
- mpiifort or mpiifx – MPI with the Intel Fortran compiler
Example:
mpiifort parallel.f90 -o parallel_mpi

These wrappers automatically add the right MPI include paths and libraries.
Optimization and Architecture Flags (Intro Level)
You will see Intel compiler flags in other HPC examples and build systems. At a beginner level, you should recognize these:
General Optimization Levels
- -O0 — no optimization (useful for debugging)
- -O2 — default optimization, safe and common
- -O3 — more aggressive optimizations (loop transformations, vectorization)
- -Ofast — enables -O3 plus optimizations that may break strict standards compliance (use with care)
Example:
icx -O3 code.c -o code_opt

Targeting CPU Features
Intel compilers can generate code tuned for specific microarchitectures or instruction sets:
- -xHost — generate instructions for the host CPU (may not be portable)
- -march=native (for icx/icpx, Clang-like) — similar intent to -xHost
- Legacy -xCORE-AVX2, -xCORE-AVX512, etc. on classic compilers
On shared clusters, using extremely specific CPU features may cause problems if nodes differ. Often, system-wide build defaults are chosen by admins; as a new user, you generally:
- Use provided modules and builds, or
- Stick to conservative flags unless advised otherwise
Basic Vectorization Options
Vectorization is often enabled automatically at -O2 or higher. Some related flags you might encounter:
- -qopt-report=5 (classic) or similar options — detailed optimization reports
- -qopenmp (classic) / -fopenmp (newer) — enable OpenMP (covered in the OpenMP chapter)
At this stage, just recognize that Intel compilers often add their own -q... or -f... options.
Debugging vs Optimized Builds with Intel oneAPI
Matching the “Debug builds” and “Optimized builds” concepts from the parent section, Intel compilers support:
Debug-Oriented Flags
- -g — include debug information (for use with debuggers)
- -O0 — avoid optimizations that rearrange code (easier debugging)
- -traceback — better runtime error messages from Intel Fortran (supported by both ifort and ifx)
Example debug build:
ifx -g -O0 -traceback mycode.f90 -o mycode_debug

Release/Optimized Builds
- -O2 or -O3 as discussed above
- Possibly combined with architecture-specific flags, set by the build system
Example:
icpx -O3 -march=native main.cpp -o main_fast

In practice, you usually do not tune every flag manually; instead, you:
- Use build systems (Make/CMake) that have presets for Intel
- Switch between debug and release profiles via the build system
Libraries: Intel oneAPI and Math/Linear Algebra
Many HPC applications and libraries link against Intel’s math libraries. As a beginner, you should:
- Recognize names like MKL (now part of oneMKL)
- Understand that these are highly optimized for Intel CPUs
- Know that linking is often handled for you by:
  - Build systems (cmake/make)
  - Special wrappers (e.g., mklvars.sh), or
  - Preconfigured modules (e.g., module load intel-oneapi-mkl)
Manual linking can be complex; a minimal C example might look like:
icx mycode.c -o myprog -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl

On most clusters, you will instead follow local documentation or example job scripts that already know how to link MKL correctly.
Basic Use in CMake and Makefiles (Conceptual)
Without going deep into build systems (covered later), note that:
- Intel compilers can be selected in CMake via environment variables:
export CC=icx
export CXX=icpx
export FC=ifx
cmake ..

- In a simple Makefile, you might see:
CC = icx
FC = ifx
CFLAGS = -O3
FFLAGS = -O3
As a user, you often only need to adjust which compiler is used (e.g., switch from gcc to icx) by setting these variables or by loading the appropriate module before configuring the project.
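Putting this together, a minimal CMakeLists.txt that honors those CC/CXX/FC variables might look like the sketch below (the project and file names are hypothetical):

```cmake
# Minimal project file; CMake picks up CC/CXX/FC from the environment
# at first configure, so exporting icx/icpx/ifx beforehand selects Intel.
cmake_minimum_required(VERSION 3.20)
project(myapp LANGUAGES C)
add_executable(myprog mycode.c)
```

With CMake, switching between debug and release profiles is then typically done with -DCMAKE_BUILD_TYPE=Debug or -DCMAKE_BUILD_TYPE=Release at configure time, rather than by editing flags by hand.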
When (and Why) to Use Intel oneAPI in HPC
As an absolute beginner, you don’t have to decide compilers for every project, but it helps to know typical reasons people choose Intel oneAPI on clusters:
- Performance on Intel CPUs: tuned math libraries, good vectorization, and optimization
- Compatibility with large legacy codes written for Intel compilers (ifort, icc)
- Cross-architecture development: DPC++/SYCL support for Intel GPUs and accelerators
- Integration: Many vendor-provided and community scientific codes are tested and supported with Intel toolchains
For new projects, clusters may recommend:
- GCC/Clang for portability and open-source toolchains
- Intel oneAPI where you need optimized math libraries or vendor-tuned performance
As you progress through the course, you will encounter Intel oneAPI again when:
- Compiling parallel codes (MPI/OpenMP)
- Using numerical libraries (BLAS/LAPACK/ScaLAPACK via MKL)
- Performing performance analysis on Intel-based systems