OpenShift and GPUs

GPU Workloads in OpenShift: Concepts and Architecture

GPU support in OpenShift extends the standard container and Kubernetes model to accelerate workloads such as AI/ML, visualization, and certain classes of HPC applications. The key aspects are:

In OpenShift, GPU integration is typically implemented using Node Feature Discovery (NFD) and a GPU operator, which together manage kernel drivers, container runtime hooks, and device plugins in a cluster-native way.

GPU Use Cases on OpenShift in HPC Contexts

In an HPC-oriented OpenShift environment, GPUs are used for:

The OpenShift cluster becomes a shared platform for both CPU-only and GPU-accelerated jobs, enabling multi-tenant HPC use without exposing low-level node details.

GPU Hardware and Node Roles

GPU-Enabled Node Types

In OpenShift, GPUs are typically confined to specific node roles:

Control plane nodes usually do not host GPUs. All GPU configuration happens on worker nodes or dedicated accelerator nodes.

Node Features and Labeling

To make GPUs schedulable and discoverable:

Node labeling can be automated using NFD or done manually where appropriate.
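
As an illustration, a GPU worker node labeled by NFD and GPU Feature Discovery might carry labels along the following lines; the exact label names depend on the NFD/GFD versions in use, and `gpu-type` is a hypothetical manually applied label matching the nodeSelector example later in this chapter:

apiVersion: v1
kind: Node
metadata:
  name: gpu-node-1
  labels:
    # set automatically by NFD (PCI vendor 10de = NVIDIA)
    feature.node.kubernetes.io/pci-10de.present: "true"
    # set by GPU Feature Discovery once the GPU operator is running
    nvidia.com/gpu.product: NVIDIA-A100-SXM4-40GB
    # hypothetical manually applied label used for scheduling
    gpu-type: "a100"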

OpenShift GPU Software Stack

NVIDIA GPU Operator (Typical Stack Example)

On many OpenShift clusters, GPU support is delivered through the NVIDIA GPU Operator, managed via the Operator Lifecycle Manager (OLM). Key components installed on GPU nodes include the NVIDIA driver daemonset, the NVIDIA container toolkit, the Kubernetes device plugin that advertises `nvidia.com/gpu`, GPU Feature Discovery, and DCGM-based monitoring exporters.

While concrete installation steps belong to other chapters, it’s important from an HPC perspective that this stack is automated and reproducible at cluster scale.

Extended Resources: `nvidia.com/gpu` and Friends

GPU resources are exposed to the scheduler as extended resources, such as `nvidia.com/gpu`, rather than as regular compute resources like CPU and memory.

The scheduler matches these requests against each node's advertised GPU capacity, just as it does for CPU and memory, but in whole-GPU increments.
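
As an illustration, once the device plugin is running, a GPU node advertises the extended resource in its capacity and allocatable fields, roughly like this (the counts are made up for the example):

status:
  capacity:
    cpu: "64"
    memory: "512Gi"
    nvidia.com/gpu: "4"
  allocatable:
    cpu: "63500m"
    memory: "510Gi"
    nvidia.com/gpu: "4"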

Scheduling GPU Workloads

Basic Pod Specification for GPU Access

A minimal pod that requests GPUs typically looks like:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-example
spec:
  containers:
  - name: gpu-container
    image: nvcr.io/nvidia/cuda:12.2.0-base-ubi8
    resources:
      requests:
        cpu: "2"
        memory: "8Gi"
        nvidia.com/gpu: "1"
      limits:
        cpu: "4"
        memory: "16Gi"
        nvidia.com/gpu: "1"
  restartPolicy: Never

Key aspects specific to GPUs: the GPU request is an integer count of whole devices, requests and limits for `nvidia.com/gpu` must be equal, and GPU resources cannot be overcommitted the way CPU can.

Placement and Affinity

From an HPC perspective, placement strategies matter for performance. A simple approach is to pin GPU workloads to specific node types using a nodeSelector with cluster-specific labels:

    nodeSelector:
      gpu: "true"
      gpu-type: "a100"
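
A more flexible variant of the same idea uses node affinity, so that a preferred GPU type is requested without making scheduling fail when those nodes are full; the labels are the same cluster-specific examples as above:

    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: gpu
              operator: In
              values: ["true"]
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
            - key: gpu-type
              operator: In
              values: ["a100"]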

In HPC-style clusters, performance testing is often used to derive best-practice placement policies per application class.

Multi-GPU and Distributed Jobs

Using multiple GPUs in a single pod, or spreading a job across multiple pods, introduces additional concerns. Within a single pod, several GPUs can simply be requested at once:

    resources:
      requests:
        nvidia.com/gpu: "4"

Distributed HPC or DL training is usually coordinated by higher-level controllers (e.g. Job, MPIJob, or custom operators) rather than individual pods.
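
As a minimal sketch of that pattern, a plain Kubernetes Job can launch several GPU pods in parallel; the image and command below are placeholders (nvidia-smi simply proves device access), and a real training job would run the actual workload and handle inter-pod coordination itself:

apiVersion: batch/v1
kind: Job
metadata:
  name: multi-gpu-example
spec:
  parallelism: 2      # two pods running at the same time
  completions: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: nvcr.io/nvidia/cuda:12.2.0-base-ubi8
        command: ["nvidia-smi"]   # placeholder; a real job runs the training entrypoint
        resources:
          requests:
            nvidia.com/gpu: "2"
          limits:
            nvidia.com/gpu: "2"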

Building GPU-Enabled Container Images

Base Images and Libraries

For GPU workloads on OpenShift:

A typical Dockerfile might be:

FROM nvcr.io/nvidia/cuda:12.2.0-runtime-ubi8
RUN microdnf install -y python39 python39-pip && microdnf clean all
RUN pip3 install --no-cache-dir torch torchvision
COPY train.py /workspace/train.py
WORKDIR /workspace
CMD ["python3", "train.py"]

The GPU operator ensures that the container sees the correct host driver components; the application container only needs the user-space libraries.

Reproducibility and Performance in HPC

For HPC environments:

Container build pipelines (CI/CD) and image promotion flows (dev → test → prod) should be adapted for GPU images, including handling their large size and testing on real GPU nodes.

Security and Access Control with GPUs

Multi-Tenancy Concerns

On a shared HPC/OpenShift cluster:

Quotas and Limits

Resource quotas can control GPU resource consumption:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota
  namespace: research-team-a
spec:
  hard:
    # for extended resources, only requests.* quota entries are supported;
    # since GPU requests must equal limits, this caps total GPU use
    requests.nvidia.com/gpu: "8"

In HPC environments, quota policies may mimic batch system allocations (e.g. number of GPUs per group, time-limited access windows), although enforcement of time-based policies often requires external tooling or schedulers integrated with OpenShift.
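
One native building block for such policies is quota scoped to a priority class, which allows separate GPU budgets for, say, interactive versus batch work. A minimal sketch, assuming a hypothetical `gpu-high` PriorityClass exists in the cluster:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota-high-priority
  namespace: research-team-a
spec:
  hard:
    requests.nvidia.com/gpu: "4"
  scopeSelector:
    matchExpressions:
    - scopeName: PriorityClass
      operator: In
      values: ["gpu-high"]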

Performance, Tuning, and Monitoring

Performance Considerations

Key HPC-specific considerations for GPU performance on OpenShift:

Monitoring GPU Utilization

OpenShift’s monitoring stack, extended by the GPU operator’s DCGM exporter, can provide per-GPU utilization, memory usage, temperature, and power metrics.

These metrics are essential for HPC capacity management and for validating that scheduling and placement policies actually lead to good GPU utilization.
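
For example, the DCGM exporter exposes Prometheus metrics such as DCGM_FI_DEV_GPU_UTIL, which can drive alerts on idle GPUs. The following is a sketch only; the namespace, labels, and thresholds are assumptions that need adapting to the local monitoring setup:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: gpu-utilization-alerts
  namespace: nvidia-gpu-operator
spec:
  groups:
  - name: gpu.rules
    rules:
    - alert: GPULowUtilization
      # fires when a GPU has averaged under 10% utilization for two hours
      expr: avg_over_time(DCGM_FI_DEV_GPU_UTIL[1h]) < 10
      for: 2h
      labels:
        severity: info
      annotations:
        summary: "GPU averaging under 10% utilization"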

GPU Resource Management Strategies for HPC

Dedicated vs Shared GPU Nodes

Clusters must decide how to balance flexibility and predictability:

Priority classes, preemption policies, and node taints/tolerations can enforce these strategies:

  oc adm taint nodes gpu-node-1 gpu=true:NoSchedule

and GPU workloads can then carry a matching toleration:

  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"

Time-Sharing and Fractional Use

Standard Kubernetes/OpenShift exposes GPUs as integer resources. Finer granularity is possible through mechanisms such as NVIDIA Multi-Instance GPU (MIG) partitioning or device plugin time-slicing.

These strategies are particularly relevant when running many small inference or interactive workloads on HPC clusters.
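
As one concrete example, the NVIDIA GPU Operator supports time-slicing through a device plugin configuration. The sketch below follows NVIDIA's documented examples; the key name, namespace, and replica count are assumptions that may differ in a given deployment, and the ClusterPolicy must reference this config for it to take effect:

apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config
  namespace: nvidia-gpu-operator
data:
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 4

With this in place, each physical GPU is advertised as four schedulable `nvidia.com/gpu` units, with no memory or fault isolation between the sharing workloads.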

Integration with HPC and Batch Workflows

While detailed MPI and batch scheduling integration is discussed elsewhere, GPU-specific angles include:

HPC operators often combine OpenShift’s flexibility with traditional batch semantics by layering job management on top of the platform, especially for large-scale GPU campaigns.

Operational Considerations and Troubleshooting

Common Operational Challenges

In GPU-enabled OpenShift clusters, typical issues include:

Operational practices often include:

Practical Debugging Steps

For GPU workload issues:

  1. Check pod scheduling status
    • Why is a pod Pending? Inspect events for unschedulable reasons.
  2. Verify node resources
    • Confirm that nodes advertise nvidia.com/gpu and available counts.
  3. Inspect container environment
    • Inside a running pod:
      • Run nvidia-smi (if available) to confirm device visibility.
      • Check CUDA versions and driver compatibility.
  4. Look at operator and daemonset logs
    • GPU operator pods, device plugin daemonsets, and node-level logs.
  5. Validate SCC and runtime
    • Ensure the pod is using the right service account and SCC.

In HPC settings, it’s recommended to maintain a simple, known-good “GPU diagnostics” pod spec that can be deployed quickly to validate node functionality.
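
A minimal sketch of such a diagnostics pod, assuming the CUDA base image tag used earlier in this chapter, could look like this; it requests one GPU, runs nvidia-smi, and exits, so its logs immediately show whether the device is visible:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-diagnostics
spec:
  restartPolicy: Never
  containers:
  - name: diagnostics
    image: nvcr.io/nvidia/cuda:12.2.0-base-ubi8
    command: ["nvidia-smi"]
    resources:
      requests:
        nvidia.com/gpu: "1"
      limits:
        nvidia.com/gpu: "1"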


This chapter focused on how GPUs are integrated into OpenShift from an HPC perspective: the software stack, resource model, scheduling, security, performance tuning, and operational practices. Other chapters address batch workloads, MPI, and higher-level workflow patterns that build on these GPU capabilities.
