Background and Naming
Apptainer is a container system designed specifically for HPC environments. It originated as Singularity, which became widely used on clusters because it solved security and usability problems that are common with other container engines. The Singularity project later moved into the Linux Foundation and was renamed Apptainer. Many clusters and documents still refer to “Singularity,” and in practice you will see both names.
For this course, you can treat “Singularity image” and “Apptainer image” as the same concept. On some systems the main command is called singularity, on others it is called apptainer. Many examples still use the old name, but the behavior is essentially the same.
The key idea is that Apptainer provides portable, reproducible environments that can run on HPC systems without requiring administrator privileges and without breaking existing job scheduling or security models.
Important: Apptainer is designed for unprivileged use on multi‑user systems. You can run containers as a regular user, keeping your HPC account, quotas, and scheduler policies in place.
Why Apptainer Is Popular in HPC
On HPC clusters, you usually cannot run Docker because Docker requires a daemon with elevated privileges. That is a security risk on shared systems. Apptainer avoids this by running everything under your own user ID and integrating cleanly with the batch scheduler and the parallel filesystem.
Typical reasons HPC centers adopt Apptainer include:
- Apptainer respects your user identity. Files created in the container are owned by you, not by a root user inside the container.
- Apptainer integrates with the host’s MPI libraries, GPUs, and interconnects, so you can run large parallel jobs from within containers.
- Apptainer defaults to binding your home directory and common filesystem locations into the container. Your existing input data, scripts, and job outputs are available inside the container without extra configuration.
Because of these properties, Apptainer is a central tool in reproducible HPC workflows. You can fix an entire software stack in an image and share that image with collaborators or reuse it on different clusters.
The Basic Model: Running a Single-File Image
An Apptainer container is usually stored as a single file with an extension like .sif. SIF stands for Singularity Image Format. This file contains:
- A Linux filesystem with installed software and dependencies.
- Metadata about the image, such as labels and definitions.
- Optional signatures or checksums to verify integrity.
On an HPC cluster you normally do not run a daemon or start a separate container service. Instead, you directly invoke a command like:
    singularity exec myenv.sif python script.py

or, if your system uses Apptainer:

    apptainer exec myenv.sif python script.py
This command runs python script.py inside the container environment defined by myenv.sif. From your perspective, it feels like running any other command, but with a different set of libraries, compilers, and tools.
Core usage pattern:
- Obtain or build a `.sif` image file.
- Use `exec`, `shell`, or `run` to perform work inside that image.
Typical Commands: `exec`, `shell`, and `run`
Apptainer has a small set of core subcommands that you will use frequently. The exact options and advanced features are covered elsewhere, but the basic actions are:
- `singularity shell image.sif` or `apptainer shell image.sif` starts an interactive shell inside the container. Your prompt changes, but your user ID and many host paths are still available. This is useful for exploratory work, testing, or debugging.
- `singularity exec image.sif command args...` runs a single command inside the container and then exits. This is the common pattern in batch scripts. For example:

      singularity exec path/to/env.sif ./my_parallel_code

- `singularity run image.sif` runs the container’s default entrypoint, if one is defined in the image. This is similar to running a preconfigured application or workflow. The entrypoint is set when the image is built.
On most clusters you will mix these commands with scheduler directives from systems like SLURM, using the container as a way to manage software while keeping job submission and resource allocation unchanged.
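As a concrete illustration, here is a minimal sketch of such a batch script for a SLURM cluster. The image path, resource requests, and program name are placeholders and will differ at your site:

    #!/bin/bash
    #SBATCH --job-name=container-job
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --time=01:00:00

    # The only container-specific part is the exec wrapper; scheduler
    # directives and resource requests are unchanged.
    apptainer exec ~/images/myenv.sif python script.py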
Building Images: Definition Files and `build`
On many HPC clusters you cannot perform privileged operations, so you often build images on a local workstation or on a dedicated build node, then transfer the image to the cluster.
Container images are usually created from a definition file, often with extension .def. A simple definition file (like the sketch after this list) might describe:
- The base operating system image, such as ubuntu:22.04.
- System packages to install.
- Python or other language packages to add.
- Environment variables and default run commands.
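For illustration, a minimal definition file along those lines might look as follows. The specific packages are placeholders, not recommendations:

    Bootstrap: docker
    From: ubuntu:22.04

    %post
        # Install system packages and a Python stack (placeholder choices).
        apt-get update -y
        apt-get install -y python3 python3-pip
        pip3 install numpy

    %environment
        export LC_ALL=C

    %runscript
        # Default entrypoint used by "run".
        exec python3 "$@"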
You build an image with a command like:
    sudo singularity build myenv.sif myenv.def

or, with Apptainer:

    sudo apptainer build myenv.sif myenv.def
The sudo is often required because building from scratch may need privileged operations such as installing system packages. Some setups support unprivileged builds that use different backend technologies; your site documentation will tell you which model applies.
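On sites that do support unprivileged builds, the invocation is typically a variation of the following; whether this works depends on how user namespaces or fakeroot support are configured:

    # Rootless build, if fakeroot support is enabled on this system.
    apptainer build --fakeroot myenv.sif myenv.def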
After the image is built, the .sif file can travel with you:
- You can copy it to your home directory on a cluster.
- You can store it on a shared filesystem so collaborators or multiple jobs can use it.
As long as Apptainer or Singularity is available, the image will encapsulate the same environment.
Rule of thumb: Build once, run many times. Use the same .sif image across different runs and even different clusters to improve reproducibility.
Using Container Registries and External Sources
You do not always have to build images yourself. Apptainer can pull images from several sources, including container registries that you may already use.
Common patterns include:
- Pulling from an OCI registry such as Docker Hub or a private registry. For example:

      apptainer pull ubuntu.sif docker://ubuntu:22.04

  This command converts the remote Docker image into a SIF file that you can run directly on the cluster.
- Pulling from registries that are maintained by your organization or research community. Many scientific software stacks are published as container images that can be turned into .sif files.
- Mirroring or caching images on a shared filesystem to reduce repeated downloads and to keep a consistent environment for a project or team (see the sketch after this list).
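One way to implement the caching pattern is to point Apptainer’s cache at shared storage via the APPTAINER_CACHEDIR environment variable (SINGULARITY_CACHEDIR on older installations); the path below is a placeholder:

    # Keep the pull/build cache on project storage instead of $HOME.
    export APPTAINER_CACHEDIR=/projects/myproj/.apptainer_cache
    apptainer pull ubuntu.sif docker://ubuntu:22.04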
Because SIF is a single-file format, it is easy to store these images under version control of your choice or reference them in documentation by an exact filename and checksum.
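Recording an exact checksum takes one command with standard tools; the filenames here are placeholders:

    # Record the image checksum alongside your project documentation.
    sha256sum myenv.sif >> IMAGES.sha256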
Filesystems and Bind Mounts
A key difference between Apptainer and many other container engines is the handling of host filesystems. By default, Apptainer automatically makes several host paths available inside the container. Typical examples are:
- Your home directory.
- The current working directory when you invoke exec or shell.
- Shared parallel filesystems like /scratch, /projects, or other site-specific locations, depending on configuration.
This default behavior is important for HPC workflows because:
- Input data that lives on project or scratch storage is immediately accessible inside the container.
- Output files appear where the scheduler and filesystem expect them to be, which keeps accounting, quotas, and backup policies consistent.
You can still customize which directories are visible inside the container with command line options such as --bind, but in many basic use cases the defaults are sufficient.
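When a path is not bound by default, a typical invocation looks like the following; the source and target paths are placeholders:

    # Make /data/archive on the host visible as /data inside the container.
    apptainer exec --bind /data/archive:/data myenv.sif ls /data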
In HPC, data usually stays on the host filesystem. The container is primarily used for the software environment, not as the main place to store large datasets.
MPI, GPUs, and Hardware Integration
On clusters, containers must cooperate with specialized hardware and software stacks such as MPI libraries, GPUs, and high-speed interconnects. Apptainer is designed with this requirement in mind.
For MPI applications, a typical pattern is:
- Use the system-provided MPI implementation that is tuned for the cluster hardware.
- Compile or run your application such that it calls MPI from inside the container while relying on the host MPI libraries and network drivers.
- Launch the job using the native MPI launcher provided by the scheduler, for example:

      srun singularity exec mympi.sif ./my_mpi_program

The exact details depend on your site’s configuration, but the goal is that the container environment does not hide the optimizations provided by the cluster.
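Putting the pieces together, a hedged sketch of an MPI batch job might look like this. The node counts, module name, and the assumption that the containerized MPI is compatible with the host’s are all site-specific:

    #!/bin/bash
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=32
    #SBATCH --time=02:00:00

    # Load the host MPI stack (module name is site-specific).
    module load openmpi

    # srun launches one container instance per MPI rank; the MPI inside
    # the image must be compatible with the host libraries it binds to.
    srun apptainer exec mympi.sif ./my_mpi_program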
For GPU workloads, Apptainer supports exposing host GPUs inside the container with dedicated flags, usually a variant of --nv. This makes the same CUDA or GPU drivers available to the container that are present on the host. As a result, you can package GPU-enabled applications and libraries into an image while still taking advantage of the host’s driver stack.
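As an example of the GPU pattern, the following assumes an NVIDIA node and a CUDA-enabled image; the image and script names are placeholders:

    # --nv injects the host’s NVIDIA driver libraries into the container.
    apptainer exec --nv pytorch.sif python train.py

    # A quick sanity check that the GPU is visible from inside:
    apptainer exec --nv pytorch.sif nvidia-smi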
Reproducible Workflows with Apptainer
Within the broader topic of reproducibility and software environments, Apptainer is a practical way to:
- Freeze a specific combination of compiler, libraries, and tools.
- Document the software environment through the definition file and image metadata.
- Share a single artifact, the .sif image, across collaborators and systems.
A common pattern in reproducible HPC workflows is:
- Keep the image definition file under version control next to your source code.
- Build an image for each tagged version of your project or for each major change in dependencies (see the sketch after this list).
- Record in your research notes or publications which image was used, for example by including a filename, a checksum, or a container registry tag.
- Use the same image for development, testing, and production runs as far as possible, so that you minimize environment-specific differences.
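As a sketch of the versioned-build step, with all names hypothetical, the project version can be baked into the image filename at build time:

    # Build an image whose filename records the project version.
    VERSION=$(git describe --tags)
    apptainer build "myproj-${VERSION}.sif" myproj.def

The resulting filename, together with a recorded checksum, identifies the environment precisely in notes and publications.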
Reproducibility tip: Treat the container image as part of your experiment specification. A result is not fully reproducible unless both code and environment are recorded.
Common Usage Patterns on Clusters
Although exact workflows depend on the site, beginner users on an HPC cluster often follow patterns such as:
- Obtain or build a suitable image locally, then upload image.sif to a project directory on the cluster.
- Test interactively with a small allocation, using singularity shell or apptainer shell to explore the environment, run small test problems, and verify library availability (see the sketch after this list).
- Integrate the container into a batch job script by replacing regular command invocations with singularity exec image.sif command. Scheduler directives and resource requests stay the same.
- Keep multiple images for different tasks, for example one image for compiling a code and another, lighter image for running large simulations.
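A sketch of the interactive testing step on a SLURM cluster, with placeholder flags and paths, might look like this; some sites prefer salloc, others a dedicated interactive partition:

    # Request a short interactive allocation (flags vary by site).
    salloc --nodes=1 --time=00:30:00

    # Start a container shell on the allocated node and poke around.
    srun --pty apptainer shell /projects/myproj/image.sif

    # Inside the container, for example:
    #   python -c "import numpy; print(numpy.__version__)"
    #   exit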
As you gain experience, you can refine your build process, image size, and binding options. The fundamental principle remains that the scheduler controls resources and job placement, while Apptainer controls the software environment inside the job.
Security and Policy Considerations
Finally, because Apptainer is designed for multi-user systems, it operates within the security constraints of shared clusters. Typical site policies include:
- Disallowing privileged operations in containers. You remain your normal user inside the container, which prevents escalations.
- Restricting image building on compute nodes. You might be asked to build images only on special build nodes or on your own workstation.
- Providing site-curated images for common workloads. HPC support staff might maintain images for widely used applications, which you can use directly without building your own.
When working with Apptainer and Singularity, always check your site-specific documentation to learn which versions are installed, which commands (singularity or apptainer) you should use, and which policies apply to image building and registry access.
In practice, understanding these basic ideas is enough to begin using Apptainer as a reliable tool for managing software environments and improving the reproducibility of your HPC work.