
Job Scheduling and Resource Management

Big picture: why scheduling exists in HPC

On an HPC cluster, hundreds or thousands of users share a finite set of powerful nodes. Unlike a personal workstation, you normally cannot just start a large parallel job directly on the login node or choose arbitrary compute nodes for yourself. Doing so would overload shared machines, let jobs trample each other's cores and memory, and make fair, coordinated use of the hardware impossible.

A job scheduler (also called a batch system or resource manager in combination with a scheduler) sits between users and the hardware. You describe what you want to run and what resources you need; the scheduler decides when, where, and for how long it will run.
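In SLURM, for example, the job description is a shell script whose #SBATCH comment lines carry the resource request. A minimal sketch (the partition name and program are hypothetical placeholders):

```shell
#!/bin/bash
#SBATCH --job-name=demo        # name shown in the queue
#SBATCH --time=00:10:00        # wall-clock limit: 10 minutes
#SBATCH --nodes=1              # one node
#SBATCH --ntasks=4             # four tasks (e.g. MPI ranks)
#SBATCH --mem=4G               # memory for the node allocation
#SBATCH --partition=short      # hypothetical queue/partition name

# The scheduler runs this script on the allocated node(s).
srun ./my_program input.dat    # my_program is a placeholder
```

Submitted with `sbatch job.sh`; the scheduler then decides when and where it runs. This is a job-script fragment, so it only does real work inside a SLURM allocation.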

This chapter focuses on the concepts behind batch systems: the kinds of jobs you can run, how resources are requested, and how schedulers decide who runs when.

Core concepts: jobs, resources, and policies

Most HPC schedulers share a common conceptual model, even if their commands differ: jobs (units of work with declared resource needs), resources (cores, memory, GPUs, time), and policies (rules about who may use how much, and when).

In many systems, a single software stack (e.g. SLURM, PBS Pro, LSF) handles both scheduling and resource management; conceptually, the roles are still useful to distinguish: the resource manager tracks nodes and enforces allocations, while the scheduler decides which pending job starts next, and where.

Job lifecycle

A typical job’s lifecycle within the scheduler:

  1. Submission
    You submit a job description (job script or interactive job request). It specifies:
    • Resources needed (time, cores, nodes, memory, GPUs).
    • Which partition/queue to use.
    • The executable and arguments.
    • Optional: dependencies on other jobs, account to charge, etc.
  2. Queued / pending
    The job waits in a queue. It gets a status like “pending” while it is waiting for:
    • Matching resources (enough free cores/memory/GPU nodes).
    • Priority over other pending jobs.
    • Compliance with time-of-day or maintenance windows.
  3. Scheduling / starting
    The scheduler chooses nodes that meet the job’s requirements and marks those resources as allocated. The job’s startup scripts are run, environment is set up, and your executable starts.
  4. Running
    While running:
    • Resource limits (time, memory, cores) are enforced.
    • Usage is accounted for (CPU time, energy, etc., depending on system).
  5. Completion / termination
    The job finishes successfully, fails, or is killed (e.g. time limit exceeded, manual cancellation, node failure). The scheduler:
    • Releases the resources back to the pool.
    • Optionally records accounting information.
    • Writes your job’s output and error logs.

Understanding this lifecycle helps explain most of the behavior you see when interacting with a batch system (why jobs wait, why they get killed at the time limit, etc.).
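On a SLURM system, the lifecycle can be followed with standard commands (a sketch; the job id 12345 is hypothetical):

```shell
sbatch job.sh                  # submit; prints the new job's id
squeue -u "$USER"              # state column: PD (pending), R (running)
scancel 12345                  # cancel a queued or running job
sacct -j 12345 --format=JobID,State,Elapsed,MaxRSS   # accounting after the job ends
```

These commands require a running SLURM installation, so they are shown as a command sketch rather than a runnable example.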

Types of jobs in batch systems

Schedulers typically support multiple job “modes” that suit different usage patterns: batch (non-interactive) jobs, interactive jobs, array jobs, and jobs linked by dependencies.

These are all just variations on the central idea: describe what you need and what to run; the system executes when resources become available.

Batch (non-interactive) jobs

This is the most common type on HPC clusters.

Characteristics:
  • Defined by a job script written ahead of time; the job runs unattended.
  • Output and errors are written to log files you inspect afterwards.
  • May start hours after submission, whenever resources become free.

Typical use cases:
  • Long-running simulations and production runs.
  • Parameter sweeps and other work that needs no supervision.

Implications:
  • The script must be self-contained: environment setup, input paths, and error handling cannot rely on a human watching the run.

Interactive jobs

Interactive jobs give you a command-line shell on allocated compute resources rather than starting a pre-defined script. This differs from just working on the login node because the interactive session runs on dedicated compute nodes, under the same resource and time limits as any other job.

Typical use cases:
  • Debugging and small-scale testing before submitting large batch runs.
  • Compiling, profiling, or exploratory analysis that is too heavy for the shared login node.

From the scheduler’s point of view, these are still jobs with requested resources and time; they are simply driven by your shell rather than a pre-written batch script.

Array jobs

Array jobs are a way to submit many similar jobs as a single logical entity.

Characteristics:
  • A single submission creates many tasks, each identified by an index.
  • Every task runs the same script; the index selects its input or parameters.

Typical use cases:
  • Parameter sweeps, per-file processing, Monte Carlo runs with different random seeds.

Scheduling advantage:
  • The scheduler manages one logical entity instead of thousands of separate jobs, and individual tasks can start independently as resources free up.

Algorithmically, array jobs align with embarrassingly parallel workloads: largely independent tasks that do not need to communicate.
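As a sketch, a SLURM-style array task picks its work item from the index the scheduler provides (the input-file naming is hypothetical; the default value lets the script also run outside the scheduler):

```shell
#!/bin/bash
#SBATCH --array=0-99           # 100 tasks, indices 0..99
#SBATCH --time=00:30:00

# SLURM sets SLURM_ARRAY_TASK_ID per task; default to 0 outside SLURM.
i="${SLURM_ARRAY_TASK_ID:-0}"
input="input_${i}.dat"         # hypothetical per-task input naming
echo "task $i processing $input"
# ./process "$input"           # real work would go here
```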

Job dependencies and workflows

Schedulers often support declaring dependencies between jobs, for example: start job B only after job A finishes successfully, or run a cleanup job once another job ends, regardless of its outcome.

This enables simple workflow orchestration without external tools: a preprocess → simulate → postprocess chain can be submitted in one go, with each stage waiting on the previous one.

At the scheduling level, dependencies simply hold a job in the pending state until its prerequisite jobs have reached the required state.
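With SLURM, such chains can be expressed at submission time (a sketch; the script names are hypothetical):

```shell
# Submit a preprocessing step, then chain later stages onto it.
jobid=$(sbatch --parsable preprocess.sh)         # --parsable prints just the id
sbatch --dependency=afterok:"$jobid" main.sh     # starts only if preprocess succeeds
sbatch --dependency=afterany:"$jobid" cleanup.sh # starts when it ends, success or not
```

These commands need a live cluster, so they are shown as a command sketch.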

Resource types and how they’re expressed

Different schedulers use different flags, but conceptually you ask for: wall-clock time, CPU cores/tasks/nodes, memory, and, where available, GPUs or other accelerators.

Each of these affects both when your job starts and how it runs.

Wall-clock time

You tell the scheduler how long your job is allowed to run: the wall time or time limit.

Why it matters:
  • A job that exceeds its limit is killed, and unsaved work is lost.
  • Shorter, accurate requests are easier to schedule and often start sooner.

Practical strategy:
  • Measure a representative run, then request that time plus a modest safety margin (say 20–30%), rather than always asking for the queue maximum.
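The enforcement at the limit behaves much like coreutils' `timeout`, which can be tried locally:

```shell
# A wall-clock limit in miniature: a 10 s "job" under a 2 s "limit".
status=0
timeout 2s sleep 10 || status=$?   # timeout kills the process after 2 s
echo "exit status: $status"        # 124 is timeout's "hit the limit" code
```

The analogy is loose (schedulers usually warn and then send signals), but the core behavior is the same: at the limit, the process is killed.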

CPU cores, tasks, and nodes

Scheduler requests distinguish between:
  • Nodes: whole machines (or shares of them).
  • Tasks: separate processes, e.g. MPI ranks, possibly spread across nodes.
  • Cores per task: threads available to each process.

At the scheduling level:
  • All of these must be satisfied simultaneously; many cores on one node is a harder request to place than the same total spread over several nodes.

Implications:
  • Your request should mirror how your program parallelizes (MPI processes vs. threads vs. hybrid); a mismatch either wastes cores or oversubscribes them.

Cluster policies often limit maximum cores per user or per job to avoid starvation of other users.
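A hybrid MPI+OpenMP request illustrates how the three levels combine (SLURM flag names; the application name is a placeholder):

```shell
#!/bin/bash
#SBATCH --nodes=4              # 4 nodes
#SBATCH --ntasks-per-node=8    # 8 MPI ranks on each node
#SBATCH --cpus-per-task=4      # 4 threads per rank -> 4*8*4 = 128 cores total

export OMP_NUM_THREADS="$SLURM_CPUS_PER_TASK"   # threads follow the request
srun ./hybrid_app              # hybrid_app is a placeholder
```

This is a job-script fragment; outside an allocation the SLURM variables are unset.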

Memory

You typically request memory:
  • Per node, or
  • Per core/task, depending on the scheduler’s conventions.

The resource manager enforces memory limits:
  • A job that exceeds its request is typically killed, usually with an out-of-memory message in its log.

Implications:
  • Requesting too little gets your job killed; requesting far too much delays scheduling and blocks memory other jobs could have used.

Some clusters offer special high-memory partitions/nodes. These usually have few nodes, longer queues, and stricter access policies, so use them only when a standard node genuinely cannot hold your working set.
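In SLURM terms, memory is requested with one of two mutually exclusive flags (a fragment, not a full script):

```shell
#SBATCH --mem=64G         # total memory per node
# ...or, alternatively (not together with --mem):
#SBATCH --mem-per-cpu=4G  # memory per allocated core
```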

GPUs and accelerators

When GPUs or other accelerators are present:
  • You request them explicitly, usually by count and often by type.

From a scheduling perspective:
  • GPUs are scarce and in demand, so GPU jobs often wait longer and GPU partitions carry their own limits.

GPU allocation interacts with CPU and memory: each GPU is normally paired with a share of the node’s cores and RAM, and a lopsided request (many GPUs with almost no CPU, or the reverse) can strand resources that nobody else can use.
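Typical SLURM-style GPU request lines (a fragment; the feature tag a100 is a hypothetical example of a GPU-type constraint):

```shell
#SBATCH --gres=gpu:2           # two GPUs per node (generic-resource syntax)
# newer SLURM versions also accept the equivalent:
# #SBATCH --gpus-per-node=2
#SBATCH --constraint=a100      # only nodes tagged with this feature
```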

Special features and constraints

Complex clusters may tag nodes with features: CPU generation, GPU model, interconnect type, amount of local disk, and so on.

By specifying constraints (e.g. “need GPU type X”), you:
  • Restrict the set of nodes your job can use, which may lengthen the wait.
  • Guarantee that the job lands on hardware it actually supports.

Schedulers attempt to balance these constraints with efficient resource usage and fairness among users.

Queues / partitions and cluster policies

Clusters organize their resources through queues (also called partitions, classes, or projects). Each queue has associated policies: maximum wall time, limits on job size, default resources, priority, and which users or projects may submit to it.

Schedulers use queues to implement:
  • Separation of workloads (short test jobs vs. long production runs, CPU vs. GPU nodes).
  • Different limits and priorities for different groups or projects.

As a user, choosing the right queue:
  • Determines which limits apply and how long you wait; a dedicated short-job or debug queue often starts far sooner than the general-purpose one.
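On SLURM, the available partitions and their limits can be listed (a sketch; the partition name gpu is hypothetical):

```shell
sinfo -s                        # one summary line per partition
sinfo -p gpu -o "%P %l %D %G"   # partition, time limit, node count, GRES
```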

Priorities, fairness, and accounting

Schedulers have to juggle many users and jobs simultaneously while enforcing fairness between users, per-project allocations, and site policies.

They do this using priorities, fair-share policies, and sometimes accounting/billing models.

Priority factors

A job’s effective scheduling priority usually depends on several factors, such as:
  • How long the job has been waiting (age).
  • The user’s or project’s recent consumption (fair-share).
  • Job size and the priority of the chosen queue.

Exact formulas are system-dependent, but the effect is that jobs do not run in strict submission order: waiting time, past usage, and job size all move a job up or down the queue.

Fair-share and quotas

Fair-share aims to prevent a single user or group from monopolizing the cluster. It can be implemented via:
  • Priority penalties for users or projects with heavy recent usage.
  • Hard quotas on cores, GPUs, or core-hours per project.

Understanding this helps you interpret scheduler behavior: if your jobs suddenly wait longer after a burst of heavy usage, fair-share is likely the reason.

Many centers also track usage for reporting or billing (internal or external). That accounting may affect priority indirectly (e.g. projects that overrun their allocation may be deprioritized).
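SLURM exposes both sides of this on the command line (a sketch; the job id is hypothetical):

```shell
sprio -j 12345        # per-factor breakdown of a pending job's priority
sshare -u "$USER"     # your fair-share usage and normalized share
```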

Backfilling and utilization

A key scheduling strategy in HPC is backfilling: while a large job waits for enough nodes to free up, the scheduler starts smaller, shorter jobs on the idle nodes, provided they are guaranteed to finish before the large job’s planned start time.

Effects:
  • Utilization stays high even while big jobs wait at the front of the queue.
  • Short jobs with tight time limits can start far earlier than their queue position suggests.

Implications for users:
  • Accurate, modest wall-time requests make your jobs good backfill candidates; heavily padded requests disqualify them.
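SLURM will even report its current start-time estimate, which reflects this backfill planning (job id hypothetical):

```shell
squeue -j 12345 --start   # scheduler's estimated start time for a pending job
```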

Preemption and job interruptions

Some systems use preemption: a running job may be suspended, requeued, or killed to make room for a higher-priority one.

Common scenarios:
  • Low-priority “scavenger” queues that use idle nodes but yield when the owners need them.
  • Urgent or reserved workloads displacing normal jobs.

From a resource-management viewpoint:
  • Preemption trades the interrupted job’s progress for higher overall utilization, so preemptable jobs should checkpoint regularly and be safe to restart.
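A preemption-tolerant SLURM job can opt into requeueing and ask for a warning signal before being stopped (a sketch; save_state is a hypothetical checkpoint hook):

```shell
#!/bin/bash
#SBATCH --requeue               # allow the scheduler to requeue this job
#SBATCH --signal=B:USR1@120     # send SIGUSR1 to the batch shell 120 s early

save_state() { echo "checkpointing..."; }   # hypothetical checkpoint hook
trap save_state USR1            # run it when the warning signal arrives

# ./long_simulation             # placeholder for the real workload
```

This is a job-script fragment; the signal and requeue behavior only apply under SLURM.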

User responsibilities in resource management

While schedulers automate placement and priority, effective resource management relies heavily on user behavior.

As a user, you are responsible for:
  • Requesting resources that match what your job actually uses (time, cores, memory, GPUs).
  • Choosing appropriate queues and respecting site policies.
  • Checkpointing long runs and cleaning up scratch data afterwards.

Doing these well improves:
  • Your own throughput: shorter waits, fewer killed jobs.
  • Overall cluster utilization and fairness for everyone.

Interaction with other parts of the HPC stack

Job scheduling and resource management form the bridge between your application and its software environment on one side, and the cluster’s shared hardware on the other.

In this chapter we focused on the concepts: job types, resource requests, queues, priorities, fair-share, backfilling, and preemption.

Other chapters cover the practical side: writing job scripts, the concrete commands of specific schedulers, and tuning applications to use their allocations efficiently.

Together, these aspects will let you turn your code or application into reproducible, efficient workloads on real HPC systems.
