Goals of a Job Script
A job script is a text file that tells the scheduler:
- What resources you need (time, CPUs, memory, GPUs, etc.)
- What environment you want (modules, variables)
- What commands to run (your program, pre/post steps)
- Where to send output and error messages
In most HPC clusters, job scripts are submitted to a batch system (for example SLURM) using a command like sbatch. The rest of this chapter focuses on the practical aspects of writing such scripts.
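For example, submitting and monitoring a job from the command line typically looks like the following (my_job.sh is a placeholder name for your script, and 123456 stands for a real job ID):
# Submit the script; SLURM replies with the assigned job ID
sbatch my_job.sh
# List your queued and running jobs
squeue -u $USER
# Cancel a job if necessary
scancel 123456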
Basic Structure of a Batch Job Script
A typical batch job script has three main parts:
- Shebang line – which shell to use
- Scheduler directives – special comments describing resources and job options
- Job body – the commands to execute
Minimal SLURM example:
#!/bin/bash
#SBATCH --job-name=my_test
#SBATCH --output=my_test_%j.out
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
echo "Running on host: $(hostname)"
echo "Starting at: $(date)"
# Load environment
module purge
module load gcc
# Run the program
./my_program input.dat
Key ideas:
- Lines starting with #SBATCH are directives to SLURM, not normal shell comments.
- Everything after the directives is executed by the shell you specify in the shebang.
- #SBATCH directives are only recognized before the first executable command; directives placed later in the script are silently ignored.
Shebang and Shell Choice
The first line selects the shell:
- #!/bin/bash – most common on Linux clusters
- #!/bin/zsh, #!/bin/sh – possible alternatives if supported
The shell determines:
- Available syntax ([[ ... ]], arrays, functions, etc.)
- How environment variables and loops are written
For beginners, #!/bin/bash is usually the best default.
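As a small illustration of why the shell matters, the following constructs work in bash but may fail under a plain /bin/sh:
# Bash-specific test syntax
if [[ -f input.dat ]]; then
    echo "input file found"
fi
# Bash arrays
sizes=(128 256 512)
echo "first size: ${sizes[0]}"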
Common SLURM Directives in Job Scripts
Directives control job behavior. Most have a long form, and many also have a short form. Some of the most commonly used are listed below.
Job identification and accounting
--job-name=NAME (-J NAME)
A short, descriptive name that appears in queue listings.
--account=ACCOUNT (-A ACCOUNT)
Project/account to charge. Often required on shared systems.
--partition=PART (-p PART)
Queue/partition to use (e.g., short, long, gpu).
Example:
#SBATCH --job-name=matrix_mul
#SBATCH --account=project123
#SBATCH --partition=short
Time limits
--time=HH:MM:SS (-t HH:MM:SS)
Maximum wall-clock time you request.
Examples:
#SBATCH --time=00:30:00 # 30 minutes
#SBATCH --time=2-00:00:00 # 2 days (D-HH:MM:SS format)
Request only as much as you realistically need; this can improve queue wait times and system efficiency.
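One way to calibrate time requests is to compare the requested limit with how long similar jobs actually ran. On clusters where SLURM accounting is enabled, sacct can show this for a finished job (123456 is a placeholder job ID):
# Requested time limit vs. actual elapsed time for a completed job
sacct -j 123456 --format=JobID,JobName,Timelimit,Elapsed,State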
CPU and task layout
Typical directives (interpretation depends on how the job is launched):
--ntasks=N (-n N)
Number of tasks (often equals the number of MPI processes).
--cpus-per-task=C (-c C)
Number of CPU cores (threads) per task, often used for OpenMP.
--nodes=N (-N N)
Number of nodes to allocate.
Examples:
# Single-core serial job
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
# 16 MPI processes, one CPU core per task
#SBATCH --ntasks=16
#SBATCH --cpus-per-task=1
# 4 MPI processes, each using 8 threads (e.g., OpenMP)
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8
The job script usually combines these directives with the appropriate launch command in the body (srun, mpirun, OpenMP settings, etc.), which is covered elsewhere.
Memory requests
Memory can be requested per node, per CPU, or per task depending on the cluster configuration.
Common forms:
--mem=4G
Total memory per node (e.g., 4 gigabytes per node).
--mem-per-cpu=2G
Memory per CPU core.
Examples:
#SBATCH --mem=8G # 8 GB total per node
#SBATCH --mem-per-cpu=2G # 2 GB per CPU core
Check the local cluster documentation to know which style is expected.
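As with time limits, accounting data can help right-size memory requests. If sacct is available on your system, the MaxRSS field reports the peak memory a finished job step actually used (123456 is again a placeholder job ID):
# Peak memory use (MaxRSS) compared with the requested memory (ReqMem)
sacct -j 123456 --format=JobID,ReqMem,MaxRSS,Elapsed,State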
GPUs and accelerators
If the cluster has GPUs, you typically request them like:
--gres=gpu:NUM
Generic resources (GRES); gpu:NUM requests NUM GPUs per allocated node.
Example:
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
Here, one GPU is requested, along with 8 CPU cores and 32 GB of memory.
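Inside the job body it can be useful to confirm that the GPU allocation is actually visible to your program. A minimal check, assuming the node provides the NVIDIA nvidia-smi tool and that the cluster exports CUDA_VISIBLE_DEVICES for the allocation:
# List the GPU(s) assigned to this job
nvidia-smi
# SLURM typically restricts visible GPUs through this variable
echo "CUDA_VISIBLE_DEVICES: $CUDA_VISIBLE_DEVICES"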
Output and error handling
You can control where the job’s output and error messages go:
--output=FILE (-o FILE)
Standard output file.
--error=FILE (-e FILE)
Standard error file.
--mail-user=EMAIL
Email address for notifications.
--mail-type=BEGIN,END,FAIL
When to send emails.
Useful placeholders:
%j – job ID
%x – job name
%u – user name
Examples:
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err
#SBATCH --mail-user=myname@example.edu
#SBATCH --mail-type=FAIL,END
Note that the scheduler does not create missing output directories: if a directory such as logs/ does not exist when the job starts, the output file cannot be written.
Organizing the Job Body
Within the job body, you write ordinary shell commands, but in a way that:
- Reconstructs the environment reliably
- Makes it easy to debug
- Records useful metadata
A common pattern:
- Initial info and safety checks
- Environment modules and variables
- Directory setup
- Run commands
- Final logging
Example:
#!/bin/bash
#SBATCH --job-name=heat_2d
#SBATCH --output=logs/%x_%j.out
#SBATCH --time=01:00:00
#SBATCH --partition=short
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=1G
# 1. Log basic info
echo "Job ID: $SLURM_JOB_ID"
echo "Job name: $SLURM_JOB_NAME"
echo "User: $USER"
echo "Running on nodes: $SLURM_NODELIST"
echo "Number of tasks: $SLURM_NTASKS"
echo "CPUs per task: $SLURM_CPUS_PER_TASK"
echo "Started at: $(date)"
# 2. Setup environment
module purge
module load gcc/12.2
module load openmpi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# 3. Move to the directory from which the job was submitted
cd "$SLURM_SUBMIT_DIR"
# 4. Run the application
srun ./heat_2d_solver --nx 2048 --ny 2048 --steps 500
# 5. Final log
echo "Finished at: $(date)"Using Environment Variables Provided by the Scheduler
Schedulers often set environment variables that job scripts can use. For SLURM, common ones include:
SLURM_JOB_ID – numeric job ID
SLURM_JOB_NAME – job name
SLURM_SUBMIT_DIR – directory from which you ran sbatch
SLURM_NTASKS – number of tasks allocated
SLURM_CPUS_PER_TASK – CPUs per task
SLURM_NODELIST – list of nodes allocated
Practical uses:
- Ensure you are in the expected directory:
cd "$SLURM_SUBMIT_DIR"- Control OpenMP threads:
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK- Make logs self-describing:
echo "Nodes: $SLURM_NODELIST"Handling Working Directories and Paths
Common strategies in job scripts:
- Use absolute paths when possible to avoid confusion:
DATA_DIR=/scratch/$USER/data
RESULT_DIR=/scratch/$USER/results
- Create necessary directories:
mkdir -p "$RESULT_DIR"
- Stage input data on a fast scratch or node-local filesystem (if your cluster distinguishes home vs. scratch):
cp input/*.dat "$SLURM_TMPDIR"/
cd "$SLURM_TMPDIR"
srun ./my_code
cp results/* "$RESULT_DIR"/
Check your site documentation for recommended directories (e.g., $SCRATCH, $SLURM_TMPDIR, etc.).
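When staging data on a node-local directory, it is also worth making sure results are copied back even if the program fails part-way. A sketch of that pattern, assuming $SLURM_TMPDIR and $SLURM_SUBMIT_DIR are available and that the program writes into a results/ subdirectory (note that the trap cannot run if the job is killed outright at its time limit):
RESULT_DIR="$SLURM_SUBMIT_DIR/results"
mkdir -p "$RESULT_DIR"
# Copy results back when the script exits, whether it succeeded or not
trap 'cp -r "$SLURM_TMPDIR"/results/* "$RESULT_DIR"/ 2>/dev/null' EXIT
cp input/*.dat "$SLURM_TMPDIR"/
cd "$SLURM_TMPDIR"
mkdir -p results
srun ./my_code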
Serial, OpenMP, MPI, and Hybrid Job Script Patterns
The directives and body structure vary with the parallel model you use. A few common patterns:
Pure serial job
#!/bin/bash
#SBATCH --job-name=serial_test
#SBATCH --output=serial_%j.out
#SBATCH --time=00:05:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
cd "$SLURM_SUBMIT_DIR"
./serial_program input.dat
Shared-memory (OpenMP-style) job
#!/bin/bash
#SBATCH --job-name=openmp_job
#SBATCH --output=openmp_%j.out
#SBATCH --time=00:30:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=8G
cd "$SLURM_SUBMIT_DIR"
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./openmp_program input.dat
Distributed-memory (MPI-style) job
#!/bin/bash
#SBATCH --job-name=mpi_job
#SBATCH --output=mpi_%j.out
#SBATCH --time=02:00:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --mem=4G
cd "$SLURM_SUBMIT_DIR"
module load openmpi
srun ./mpi_program input.dat
Hybrid MPI + OpenMP job
#!/bin/bash
#SBATCH --job-name=hybrid_job
#SBATCH --output=hybrid_%j.out
#SBATCH --time=02:00:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=8
#SBATCH --mem=8G
cd "$SLURM_SUBMIT_DIR"
module load mpi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./hybrid_program input.dat
The details of MPI/OpenMP themselves are covered elsewhere; here the focus is on matching directives to how you intend to run the job.
Parameter Sweeps and Simple Loops in Job Scripts
Sometimes you want a single job to run multiple related simulations with different parameters. You can use shell loops inside the job body:
#!/bin/bash
#SBATCH --job-name=param_sweep
#SBATCH --output=param_sweep_%j.out
#SBATCH --time=04:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=4G
cd "$SLURM_SUBMIT_DIR"
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
for NX in 128 256 512 1024; do
echo "Running with NX=$NX at $(date)"
./solver --nx "$NX" --ny "$NX" --steps 200 > "run_NX${NX}.log"
doneThis pattern is useful for small sweeps that can fit comfortably within a single job’s time and resource limits.
For large parameter sweeps, array jobs are more appropriate (introduced elsewhere), but the basic structure still resides in a job script.
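For orientation, a minimal array-job version of the sweep above might look like the following sketch; details such as allowed array sizes are site-dependent, and array jobs themselves are covered elsewhere:
#!/bin/bash
#SBATCH --job-name=sweep_array
#SBATCH --output=sweep_%A_%a.out   # %A = array job ID, %a = array index
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=4G
#SBATCH --array=0-3
cd "$SLURM_SUBMIT_DIR"
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Map the array index to a grid size
SIZES=(128 256 512 1024)
NX=${SIZES[$SLURM_ARRAY_TASK_ID]}
./solver --nx "$NX" --ny "$NX" --steps 200 > "run_NX${NX}.log"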
Job Scripts for Interactive Sessions
Some clusters allow interactive jobs directly from the command line, but you can also use a script to request an interactive shell with certain resources:
#!/bin/bash
#SBATCH --job-name=interactive
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=4G
# This command starts an interactive shell on a compute node
srun --pty bash
Submitting this script may, on some systems, give you a shell on a compute node with the resources you requested; you can then run commands interactively within that environment. Many clusters, however, expect interactive sessions to be requested directly from the login node rather than through sbatch.
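On many SLURM systems you can request an interactive session directly from the login node with the same resource options used in batch scripts (the preferred command varies by site):
# Open an interactive shell on a compute node with the requested resources
srun --time=01:00:00 --ntasks=1 --cpus-per-task=4 --mem=4G --pty bash
# Alternatively, create an allocation first and launch commands in it with srun
salloc --time=01:00:00 --ntasks=1 --cpus-per-task=4 --mem=4G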
Common Pitfalls When Writing Job Scripts
Some frequent mistakes and how to avoid them:
- Forgetting the shebang
Result: job may fail to start or use an unexpected shell.
Fix: always start with #!/bin/bash (or another explicit shell).
- Mismatched directives and launch commands
Example: asking for --ntasks=16 but running ./program (serial) instead of srun ./program.
Fix: ensure resource requests align with how you run your code.
- Not changing to the submit directory
Programs run in a default directory that may not contain your input files.
Fix: add cd "$SLURM_SUBMIT_DIR" early in the job body.
- Requesting inconsistent memory
Using both --mem and --mem-per-cpu in ways that conflict, or requesting more memory than a node actually has.
Fix: check node specs and cluster policies; use one clear memory request style.
- Hard-coding temporary paths incorrectly
Writing to /tmp directly on systems where node-local temporary directories are different or cleaned aggressively.
Fix: use site-provided environment variables like $SLURM_TMPDIR if available.
- No logging or diagnostics
When something goes wrong, there is little information to diagnose it.
Fix: echo job info, and optionally add set -e (stop on first error) or set -x (print commands as they run) when debugging jobs.
Example for debugging:
#!/bin/bash
#SBATCH --job-name=debug_example
#SBATCH --output=debug_%j.out
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem=1G
set -e # exit on error
set -x # print commands
cd "$SLURM_SUBMIT_DIR"
./possibly_flaky_program
Local Customizations and Templates
Clusters often provide:
- Sample job scripts in a shared directory
- Documentation on required directives (e.g., default partition, accounting options)
- Recommended settings for specific applications
A good practice is to:
- Start from a working example provided by your site.
- Save your own minimal template scripts for common scenarios (serial, OpenMP, MPI, GPU).
- Modify copies of these templates for specific projects rather than starting from scratch.
Example template header you can adapt:
#!/bin/bash
#SBATCH --job-name=JOBNAME
#SBATCH --output=logs/%x_%j.out
#SBATCH --partition=PARTITION
#SBATCH --account=ACCOUNT
#SBATCH --time=HH:MM:SS
#SBATCH --nodes=N
#SBATCH --ntasks-per-node=T
#SBATCH --cpus-per-task=C
#SBATCH --mem=MEM
module purge
# module load ...
cd "$SLURM_SUBMIT_DIR"
# export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# srun ./program ...
Filling in the placeholders consistently reduces errors and makes your jobs easier to manage.
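A typical workflow with such a template might look like this (file names here are only placeholders):
# Copy the template and adapt it for a specific run
cp template_job.sh heat_run.sh
# Edit the placeholders (job name, partition, time, resources) with your editor of choice
vim heat_run.sh
# Submit the customized script
sbatch heat_run.sh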