
Developing code locally

Why Develop Code Locally First?

Developing on your local machine (laptop/workstation) before moving to an HPC cluster offers faster edit-compile-test cycles, easier interactive debugging, and freedom from batch queues and shared-resource limits.

The goal is not to fully replicate the cluster, but to make it easy to move your work there later with minimal changes.

Choosing a Local Development Environment

Operating system considerations

HPC clusters almost always run Linux, so a Linux or Unix-like local environment (native Linux, macOS, or WSL on Windows) keeps your setup closest to the cluster. Whatever operating system you use, aim to have a working compiler toolchain, a build tool, and an editor you are comfortable with.

Local vs remote development workflows

Common patterns include developing entirely on your local machine and syncing code to the cluster with Git, editing files directly on the cluster over SSH, or a mixture of both.

This chapter focuses on the local side: getting code and tools set up so that moving to the cluster is smooth.

Editors and IDEs for HPC-Oriented Workflows

Text editors

When choosing an editor, consider whether it handles the languages you work in, whether it is usable in a terminal or over SSH, and how well it integrates with Git and your build tools.

You do not need an elaborate IDE to write effective HPC code; consistency and familiarity matter more.

Full IDEs

If you prefer a full IDE, that works well for local development too.

Keep in mind: the IDE itself will not run on the cluster’s compute nodes; you will still need to understand command-line tools eventually.

Local Toolchain Setup

Installing compilers

Aim to install at least a C/C++ compiler, a Fortran compiler if your project uses Fortran, and a recent Python.

Examples:

  # Debian/Ubuntu
  sudo apt update
  sudo apt install build-essential gfortran python3 python3-venv

  # Fedora/RHEL
  sudo dnf groupinstall "Development Tools"
  sudo dnf install gcc-gfortran python3 python3-virtualenv

  # macOS (Homebrew)
  brew install gcc cmake make python

If your cluster uses a specific compiler (e.g., Intel, NVIDIA HPC SDK), you typically won’t install the same compiler locally, but try to match language versions and major features (e.g., C++17 support).

Build systems and tools

Install the build tools you are likely to use on the cluster as well, typically CMake (or Make), Ninja, and Git.

Examples:

  # Debian/Ubuntu
  sudo apt install cmake ninja-build git

  # macOS (Homebrew)
  brew install cmake ninja git

Project Structure for Easy Cluster Migration

A clean project layout makes it almost trivial to move code between local and cluster environments.

Suggested layout

myproject/
  src/
    main.c
    solver.c
    solver.h
  include/
    myproject/
      config.h
  tests/
    test_solver.c
  CMakeLists.txt       # or Makefile
  README.md
  scripts/
    run_small.sh
    run_large_cluster.sh

Principles:

  • Keep sources, headers, tests, and run scripts in separate directories.
  • Keep the build configuration (CMakeLists.txt or Makefile) at the top level.
  • Document how to build and run the code in the README.

Avoiding environment-specific assumptions

Try not to hard-code compiler names, optimization flags, library or data file paths, or core and GPU counts; keep them overridable from the outside.

Example using Makefile with overridable variables:

# Compiler and flags are defaults that can be overridden on the command line
CC ?= gcc
CFLAGS ?= -O2 -Wall

all: myprog

# Object files are built by make's implicit rules, which also use $(CC) and $(CFLAGS)
myprog: main.o solver.o
	$(CC) $(CFLAGS) -o $@ $^

clean:
	rm -f *.o myprog

On the cluster you can then run:

make CC=icc CFLAGS="-O3 -xHost -qopenmp"

without changing your source.
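
The same principle applies inside the source. Rather than hard-coding a path that only exists on one machine, read it from the environment or the command line. A minimal sketch (the SCRATCH_DIR variable name here is just an illustration; use whatever your system provides):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* Take the output directory from the environment instead of a fixed path. */
    const char *outdir = getenv("SCRATCH_DIR");
    if (outdir == NULL) {
        outdir = ".";  /* sensible fallback for local runs */
    }

    char path[4096];
    snprintf(path, sizeof(path), "%s/result.dat", outdir);

    FILE *f = fopen(path, "w");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    fprintf(f, "result goes here\n");
    fclose(f);
    return 0;
}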

Developing with Parallelism Locally

You often want to test parallel concepts locally before going to the cluster, but with smaller scale.

OpenMP on your machine

If your compiler supports OpenMP (most do):

  gcc -O2 -fopenmp -o myprog main.c
  export OMP_NUM_THREADS=4
  ./myprog
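
Here main.c stands for whatever code you are developing. For a first smoke test, a minimal OpenMP program like the following sketch is enough to confirm that threads are actually being used:

#include <stdio.h>
#include <omp.h>

int main(void) {
    /* Each thread reports its ID; the total comes from OMP_NUM_THREADS. */
    #pragma omp parallel
    {
        printf("Hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}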

Local machines usually have fewer cores than cluster nodes, but they still let you verify that your threaded code is correct, experiment with thread counts and scheduling, and catch obvious race conditions early.

MPI on your machine

To experiment with MPI locally:

  1. Install an MPI implementation (e.g., MPICH, Open MPI).
  2. Compile and run with a small number of processes.

Example on Ubuntu:

sudo apt install mpich
mpicc -O2 -o mympi main.c
mpirun -np 4 ./mympi
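
As with the OpenMP case, main.c is a placeholder for your own code; a minimal MPI sketch for a first local test looks like this:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}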

You cannot reproduce cluster-scale runs locally, but you can verify that your communication logic is correct, that the program behaves sensibly with different process counts, and that input and output handling works.

GPUs locally

If you have a GPU and want to experiment with accelerator code, install the matching vendor toolkit (for example CUDA for NVIDIA GPUs or ROCm for AMD GPUs) and develop and test small kernels locally.

If you do not have a GPU, still structure your code so that accelerator-specific parts are isolated behind a clear interface and a CPU fallback path exists, as in the sketch below.
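
One common pattern is to guard the accelerator path with a compile-time flag. This is only a sketch; USE_GPU and launch_vector_add_on_gpu are hypothetical names, not part of any particular toolkit:

/* vector_add.c - same calling interface with or without a GPU */

#ifdef USE_GPU
/* Implemented in a separately compiled CUDA/HIP/OpenMP-target source file. */
void launch_vector_add_on_gpu(const double *a, const double *b, double *c, int n);
#endif

void vector_add(const double *a, const double *b, double *c, int n) {
#ifdef USE_GPU
    launch_vector_add_on_gpu(a, b, c, n);
#else
    /* Portable CPU fallback that works on machines without a GPU. */
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
#endif
}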

Local Testing vs Cluster-Scale Testing

Designing test cases for local runs

Your local tests should be small, fast, and deterministic, while still exercising the same code paths as your production runs. Typical strategies are to shrink the problem size, reduce the number of iterations or time steps, and use simplified or synthetic input data. For example, a solver that will eventually run on a large 3D grid on the cluster can be exercised locally on a tiny grid that finishes in seconds.
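
A local test can then be as small as calling the solver on a tiny problem with a known answer. The sketch below assumes a hypothetical solve_small() function declared in solver.h; adapt it to your actual API:

/* tests/test_solver.c - deliberately small and deterministic */
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include "solver.h"   /* assumed to declare solve_small(); hypothetical */

int main(void) {
    /* Tiny problem with a known result; runs in well under a second. */
    double result = solve_small(8);            /* hypothetical 8x8 problem */
    assert(fabs(result - 1.0) < 1e-10);        /* hypothetical expected value */
    printf("test_solver: OK\n");
    return 0;
}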

Debug vs release builds locally

Maintain at least two build configurations locally: a debug build with symbols, low optimization, and assertions enabled, and an optimized release build close to what you will run on the cluster.

With CMake, for example:

cmake -S . -B build-debug -DCMAKE_BUILD_TYPE=Debug
cmake -S . -B build-release -DCMAKE_BUILD_TYPE=Release
cmake --build build-debug
cmake --build build-release

On the cluster, you can reuse the same CMakeLists.txt with different compilers and flags.
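
One concrete difference between the two configurations is worth knowing: CMake’s Release configuration defines NDEBUG by default, which compiles assert() checks out of the optimized build. A small illustration:

#include <assert.h>
#include <stdio.h>

int main(void) {
    int n = 16;
    /* Checked in the Debug build; removed in Release because -DNDEBUG is set. */
    assert(n > 0 && "problem size must be positive");
    printf("running with n = %d\n", n);
    return 0;
}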

Managing Dependencies Locally

Your code may rely on external libraries (e.g., BLAS, FFTW, HDF5). Local handling of dependencies should prepare you for cluster usage.

Using system packages

For commonly available libraries, install the development packages through your system’s package manager:

  sudo apt install libopenblas-dev libfftw3-dev libhdf5-dev

This is usually sufficient for development and small tests.

Using virtual environments (Python)

If your project involves Python, create a virtual environment and install your packages into it:

  python3 -m venv venv
  source venv/bin/activate
  pip install numpy mpi4py

Abstracting dependency locations

Use build system features to avoid hard-coding paths.

Example with CMake’s find_package:

find_package(FFTW3 REQUIRED)
target_link_libraries(myprog PRIVATE FFTW3::fftw3)

On your local machine, CMake will find the system-installed library; on the cluster, it can find the library provided through modules or custom installations.
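
The source code itself stays identical in both places: it only includes the header and calls the library, with no machine-specific paths. A minimal FFTW sketch:

#include <fftw3.h>

int main(void) {
    const int n = 64;
    fftw_complex *in  = fftw_malloc(sizeof(fftw_complex) * n);
    fftw_complex *out = fftw_malloc(sizeof(fftw_complex) * n);

    for (int i = 0; i < n; i++) {   /* simple test signal */
        in[i][0] = (double)i;       /* real part */
        in[i][1] = 0.0;             /* imaginary part */
    }

    /* Where fftw3.h and libfftw3 come from is the build system's problem. */
    fftw_plan plan = fftw_plan_dft_1d(n, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
    fftw_execute(plan);

    fftw_destroy_plan(plan);
    fftw_free(in);
    fftw_free(out);
    return 0;
}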

Using Version Control in an HPC Context

Version control is critical when your code exists both locally and on clusters.

Basic workflow with Git

Typical pattern:

  1. Create a repository locally:
     git init myproject
     cd myproject
     git add .
     git commit -m "Initial commit"
  2. Host it on a platform (GitHub, GitLab, or an institutional Git server).
  3. On the cluster, clone the same repository:
     git clone git@github.com:username/myproject.git
  4. Synchronize changes with git pull and git push.

The benefit is that your local machine and the cluster always share a single source of truth: every change is tracked, and you can roll back to a known-good state on either side.

Ignore local-only files

Use a .gitignore file to avoid committing build artifacts or local configuration:

build/
*.o
*.exe
*.out
*.log
*.swp
venv/

If you have IDE-specific files, add them as well (e.g., .vscode/, .idea/).

Emulating Cluster-Like Conditions Locally

You cannot fully reproduce an HPC environment, but you can approximate aspects of it to prepare your code.

Resource limitations

Test how your code behaves with limited resources:

  export OMP_NUM_THREADS=2

Simulating batch-like runs

Even without a scheduler, you can wrap runs in small shell scripts that set the environment, run the program non-interactively, and redirect all output to a log file.

Example:

#!/usr/bin/env bash
set -e
export OMP_NUM_THREADS=4
./myprog input_small.dat > output.log 2>&1

This is conceptually similar to a job script and prepares you for the cluster’s batch system.

Containers for closer replication

If your cluster uses containers, or you simply want a more controlled environment, you can build and test your software inside a container image locally (for example with Docker or Apptainer/Singularity).

Even if you do not use containers on the cluster, they can help standardize your local environment.

Moving from Local to Cluster

To transition smoothly:

  1. Ensure portability:
    • Use standard C/C++/Fortran and widely supported libraries when possible.
    • Avoid OS-specific APIs unless guarded with #ifdefs or equivalents.
  2. Externalize configuration:
    • Problem size, file paths, and performance tuning parameters should be controlled via command-line arguments or configuration files, not hard-coded.
  3. Minimize assumptions about hardware:
    • Do not assume a fixed number of cores or GPUs.
    • Read those values from environment variables or provide them as runtime options (see the sketch after this list).
  4. Document your local setup:
    • Write down how to build and run the code locally.
    • This documentation forms the basis for cluster-specific instructions.
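
A minimal sketch of points 2 and 3: the problem size comes from the command line and the thread count from the environment, so nothing in the source needs to change between your laptop and a cluster node:

#include <stdio.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

int main(int argc, char **argv) {
    /* Problem size is a command-line argument, not a hard-coded constant. */
    long n = (argc > 1) ? strtol(argv[1], NULL, 10) : 1000;

    /* The thread count is whatever the environment (OMP_NUM_THREADS or the
       batch system) provides; the code never assumes a fixed core count. */
#ifdef _OPENMP
    int nthreads = omp_get_max_threads();
#else
    int nthreads = 1;
#endif

    printf("n = %ld, threads = %d\n", n, nthreads);
    /* ... actual computation would go here ... */
    return 0;
}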

When you first move to the cluster, build the code with the cluster’s compilers and run the same small test cases you used locally before attempting anything large.

Practical Development Workflow Summary

A practical, beginner-friendly workflow:

  1. Locally
    • Set up compilers, build system, and editor.
    • Create a clean project structure with Git.
    • Implement features incrementally.
    • Write small tests and run them frequently.
    • Use debug builds and basic debugging tools.
  2. Pre-cluster check
    • Ensure the project builds cleanly with a single command (make, cmake --build, etc.).
    • Remove hard-coded paths and machine-specific settings.
    • Commit your working state.
  3. On the cluster
    • Clone or pull the repository.
    • Adjust compiler/build flags via environment variables or build configuration, not by changing source every time.
    • Run the same small tests first, then scale up.

Developing code locally in this disciplined way dramatically reduces friction later, when you begin running large jobs on shared HPC systems.
