
4.6 OpenShift storage model

How OpenShift Thinks About Storage

OpenShift builds on Kubernetes’ storage primitives but adds opinionated defaults, integrations, and automation to make storage usable and manageable at cluster scale. The storage model is about how OpenShift presents, provisions, and attaches storage to applications running in pods.

At a high level, OpenShift's storage model is declarative: applications request storage through PersistentVolumeClaims, StorageClasses describe how and where that storage is provisioned, and Operators and CSI drivers connect claims to real backend volumes.

This chapter focuses on how all of these pieces hang together conceptually in OpenShift, not the low-level details of each resource (covered in storage-focused chapters).

Storage Layers in OpenShift

Conceptually, OpenShift’s storage model has several layers:

  1. Application layer (pods, Deployments, StatefulSets)
  2. Kubernetes storage abstractions (PVCs, PVs, StorageClasses, volume mounts)
  3. OpenShift storage operators and integrations
  4. Underlying storage backends (cloud, on-prem, software-defined)

You typically interact with layers 1–3. Layer 4 is implemented by infrastructure/service teams or cloud providers.

1. Application Layer

Applications in OpenShift never talk directly to storage backends. They see only volumes mounted into the container filesystem, requested declaratively through PersistentVolumeClaims (PVCs).

A pod spec refers to a PVC by name. The PVC abstracts away where the data lives and how it is provisioned.
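As a minimal sketch, a pod referencing a hypothetical claim named `app-data` might look like this; note that the pod spec contains no backend details at all:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: registry.example.com/web:latest  # hypothetical image
      volumeMounts:
        - name: data              # matches the volume name below
          mountPath: /var/lib/app # where the application expects its data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data       # the PVC name is the only storage detail the pod knows
```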

Key consequences: applications stay portable across clusters and environments, and administrators can swap the underlying storage technology by changing StorageClasses rather than application manifests.

2. Kubernetes Storage Abstractions as Used by OpenShift

OpenShift fully uses and supports the standard Kubernetes storage API objects, but with some specific behaviors and conventions.

PersistentVolumes (PVs) in the OpenShift Model

In practice, modern OpenShift deployments rely heavily on dynamic PV provisioning. Static PVs are used for integration with legacy or special-purpose storage.

PersistentVolumeClaims (PVCs) as the Main User Interface

In OpenShift, the PVC is the primary storage interface for application teams: a claim states the required capacity, access mode, and (optionally) StorageClass, and the platform finds or provisions a matching PV.

The PVC/PV binding happens once per claim; the bound PV is then used by one or more pods according to its capabilities and access mode.
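A minimal PVC, with illustrative names and sizes, might look like:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce             # single-node read/write
  resources:
    requests:
      storage: 10Gi             # requested capacity
  storageClassName: fast-block  # optional; omit to use the cluster default class
```

After a PV is bound to this claim, every pod that references `app-data` gets the same volume.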

StorageClasses as Policy and Backend Encapsulation

StorageClasses define which provisioner (CSI driver) creates volumes, backend-specific parameters (such as disk type or replication), the reclaim policy, and the volume binding mode.

In OpenShift, clusters typically ship with platform-appropriate StorageClasses out of the box, and one class can be marked as the cluster default so that PVCs without an explicit storageClassName are still provisioned.
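As an illustrative sketch (the provisioner name here is hypothetical), a StorageClass bundles these policy decisions into one object:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-block
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"  # marks the cluster default
provisioner: csi.example.com  # hypothetical CSI driver name
parameters:
  tier: ssd                   # backend-specific, opaque to Kubernetes itself
reclaimPolicy: Delete         # what happens to the PV when the PVC goes away
volumeBindingMode: WaitForFirstConsumer
```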

Storage Backends in OpenShift

OpenShift is designed to run on many environments, so its storage model must be agnostic to specific technology. It achieves this with a plugin architecture, now standardized on the Container Storage Interface (CSI).

Cloud-Provider Storage

When OpenShift runs on a public cloud (e.g., AWS, Azure, GCP), there are typically built-in StorageClasses for block storage (AWS EBS, Azure Disk, GCP Persistent Disk) and, where configured, file storage (e.g., AWS EFS or Azure Files).

Characteristics in this context: provisioning is fully dynamic, block volumes are usually zone-scoped and ReadWriteOnce, and capacity is allocated on demand per claim rather than pre-provisioned.

OpenShift integrates tightly with cloud provider CSI drivers to let PVCs automatically create cloud volumes and attach/detach them to nodes.
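For example, on AWS the EBS CSI driver (`ebs.csi.aws.com`) is typically exposed through a class along these lines (name and parameters are illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-csi
provisioner: ebs.csi.aws.com  # the AWS EBS CSI driver
parameters:
  type: gp3                   # EBS volume type
volumeBindingMode: WaitForFirstConsumer
```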

On-Prem and Virtualized Environments

In on-premise or virtualized deployments (bare metal, VMware, etc.), OpenShift relies on CSI drivers for the local infrastructure (for example, the vSphere CSI driver), network filesystems such as NFS, block protocols such as iSCSI or Fibre Channel, and software-defined storage built from local disks.

StorageClasses in these environments encapsulate details such as pool IDs, replication factor, and storage arrays.

Software-Defined Storage and OpenShift Data Foundation (ODF)

In many OpenShift deployments, software-defined storage is used to provide block, file, and object storage from the cluster's own disk capacity, independent of external arrays.

OpenShift Data Foundation (ODF) is Red Hat’s integrated option that runs Ceph-based storage (plus object storage services) inside the cluster, managed by an Operator, and exposes it through block, file, and object StorageClasses.

From the cluster user perspective, ODF is just another set of StorageClasses with specific capabilities.

Volume Types and Access Patterns

OpenShift’s storage model must support different workloads with different patterns of access, consistency, and performance requirements.

Ephemeral vs Persistent Storage

OpenShift distinguishes between ephemeral storage (the container filesystem, emptyDir, and other pod-scoped volumes that disappear with the pod) and persistent storage (PVC-backed volumes whose lifecycle is independent of any single pod).

The storage model encourages keeping durable state in persistent volumes and treating pod-local filesystems as disposable scratch space.

Access Modes and Their Implications

Common access modes in OpenShift: ReadWriteOnce (RWO, mounted read-write by one node at a time), ReadOnlyMany (ROX, mounted read-only by many nodes), and ReadWriteMany (RWX, mounted read-write by many nodes); newer CSI drivers may also offer ReadWriteOncePod (RWOP).

Implications in practice: an RWO volume can only be used on one node at a time, so scale-out workloads that share data need RWX; block backends are typically RWO, while shared filesystems provide RWX.

In OpenShift, whether a particular StorageClass supports RWO, RWX, or ROX depends on the underlying backend and the CSI driver’s capabilities.

Dynamic Provisioning and Lifecycle

Dynamic provisioning is central to OpenShift’s storage model: storage is allocated on demand when applications need it.

Binding Workflow

Typical lifecycle:

  1. An application creates a PVC (manually or via a higher-level resource).
  2. Kubernetes/OpenShift checks if there is:
    • A matching pre-existing PV (static provisioning), or
    • A StorageClass that can provision a new PV.
  3. If using a StorageClass:
    • The CSI provisioner is invoked to create a new backend volume.
    • A PV object is created and bound to the PVC.
  4. A pod using that PVC is scheduled:
    • The volume is attached to the node (if needed).
    • The volume is mounted into the container at the requested path.

Deletion lifecycle: when the PVC is deleted, the PV’s reclaim policy decides what happens next. Delete removes the PV and the backend volume; Retain keeps both for manual inspection or cleanup.

In OpenShift, default StorageClasses usually use Delete, but storage administrators might choose different policies for compliance or data retention.
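A retention-oriented class is just a StorageClass with a different reclaim policy (the driver name here is hypothetical):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: retained-block
provisioner: csi.example.com  # hypothetical driver
reclaimPolicy: Retain         # keep the PV and backend volume after PVC deletion
```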

Delayed Binding and Topology Awareness

In multi-zone/multi-region setups, it is important that storage is provisioned in a topology (zone) where the consuming pod can actually be scheduled.

StorageClasses can specify a volume binding mode: Immediate (provision and bind as soon as the PVC is created) or WaitForFirstConsumer (delay provisioning until a pod using the claim is scheduled, so the volume lands in the right zone).

OpenShift clusters often use WaitForFirstConsumer for block storage on public clouds to avoid cross-zone issues.
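The binding mode is a single field on the StorageClass (driver name hypothetical):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zoned-block
provisioner: csi.example.com             # hypothetical driver
volumeBindingMode: WaitForFirstConsumer  # provision only once a consuming pod is scheduled
# volumeBindingMode: Immediate           # alternative: provision as soon as the PVC exists
```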

Multi-Tenancy, Security, and Isolation

OpenShift’s security and multi-tenancy model influences how storage is provided and used.

Namespaces and Storage Visibility

Key points: PVCs are namespaced and visible only within their project, while PVs and StorageClasses are cluster-scoped resources managed by administrators.

This model allows self-service storage for application teams within their own projects, while backend details and cluster-wide policy stay under central control.

Security Context Constraints (SCCs) and Filesystem Access

OpenShift uses SCCs to control the user and group IDs, SELinux contexts, filesystem groups, and volume types that pods are allowed to use.

Effects on storage: mounted volumes must be accessible to the pod’s (often arbitrary, non-root) UID, so mechanisms such as fsGroup ownership and SELinux relabeling are used to make volume data readable and writable.

Cluster admins and storage admins need to ensure that StorageClasses and storage backends work with OpenShift’s security defaults (e.g., non-root containers).
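As a sketch of how a pod can make a volume writable for a non-root user (values here are illustrative; depending on the SCC, UIDs and the fsGroup may be assigned automatically from the project’s range rather than set by hand):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
spec:
  securityContext:
    runAsNonRoot: true
    fsGroup: 100600  # hypothetical GID; volume files become group-accessible to the pod
  containers:
    - name: app
      image: registry.example.com/app:latest  # hypothetical image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data  # hypothetical claim name
```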

Operator-Driven Storage Management

OpenShift heavily uses Operators to manage platform components, and storage is no exception.

CSI Drivers as Operators

Many CSI drivers are installed and managed via Operators, either platform-managed (the Cluster Storage Operator and the driver Operators it oversees) or installed through OperatorHub/OLM.

These Operators deploy and upgrade the driver components, create the associated StorageClasses, and continuously reconcile the driver’s configuration.

From the cluster user’s perspective, once the Operator is installed, they simply see additional StorageClasses and everything “just works.”

OpenShift Data Foundation Operator

When using ODF, the ODF Operator deploys the underlying storage cluster, monitors its health, and publishes the block, file, and object StorageClasses that applications consume.

Administrators interact with the Operator for capacity expansion, upgrades, and monitoring. Application teams continue to work only with PVCs and StorageClasses.

Storage Model Considerations for Workload Types

While detailed stateful application patterns are covered elsewhere, the storage model provides mechanisms for several broad workload categories.

Stateless Applications

Stateless applications keep no data that must outlive a pod; OpenShift’s storage model supports them by making ephemeral options (such as emptyDir) simple and default-friendly.

Stateful Applications

For workloads that require data persistence, PVCs (often created per replica via StatefulSet volumeClaimTemplates) provide durable volumes whose lifecycle is decoupled from individual pods.

OpenShift’s storage model ensures that when a pod is rescheduled, its PVC, and therefore its data, is reattached on the new node (within the volume’s topology and access-mode constraints).
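This pattern is most visible in a StatefulSet, where volumeClaimTemplates create one PVC per replica (names and sizes are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: registry.example.com/db:latest  # hypothetical image
          volumeMounts:
            - name: data
              mountPath: /var/lib/db
  volumeClaimTemplates:  # yields one PVC per replica: data-db-0, data-db-1, ...
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
```

Each replica keeps its own claim, so `db-1` always comes back with `data-db-1` even after rescheduling.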

Shared-Data Applications

Some workloads need shared writable storage: content management systems, build caches, or legacy applications that expect a common filesystem across replicas.

OpenShift’s storage model supports these via RWX-capable StorageClasses backed by shared filesystems (for example, CephFS via ODF, or NFS).

Admins must choose storage backends that implement correct semantics for concurrent access to avoid data corruption.
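Requesting shared storage is still just a PVC; only the access mode and class change (the class name below is one example of an ODF CephFS class and depends on what is actually installed):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-content
spec:
  accessModes:
    - ReadWriteMany  # requires an RWX-capable backend
  resources:
    requests:
      storage: 50Gi
  storageClassName: ocs-storagecluster-cephfs  # example ODF CephFS class
```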

Observability and Operations for Storage

Operators and administrators need to monitor and operate storage at scale. While detailed observability topics are covered elsewhere, the storage model includes some operational expectations.

Capacity and Quotas

OpenShift integrates with namespace-level ResourceQuotas for storage, which can cap total requested capacity and PVC counts (overall or per StorageClass), and with cluster monitoring of backend capacity.

The model encourages setting storage quotas per project and watching capacity trends before backends run out of space.
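A storage-focused ResourceQuota might look like this (project and class names are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-a                # hypothetical project
spec:
  hard:
    requests.storage: 500Gi        # total capacity requestable in this project
    persistentvolumeclaims: "20"   # maximum number of PVCs
    fast-block.storageclass.storage.k8s.io/requests.storage: 200Gi  # per-class cap
```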

Backup, Restore, and Migration

The storage model itself doesn’t define backup/restore, but it influences how such tools work: they operate on PVCs and PVs, use CSI volume snapshots where the driver supports them, and move data between classes or clusters through the same claim-based interface.

OpenShift’s role is to provide consistent Kubernetes storage semantics so such tools can operate predictably.

Summary of the OpenShift Storage Model

In short: applications ask for storage with PVCs, StorageClasses and CSI drivers turn those claims into real volumes, and Operators keep the whole chain installed, upgraded, and healthy. Subsequent storage-focused chapters zoom into individual components such as PersistentVolumes, PersistentVolumeClaims, StorageClasses, and how to design storage for stateful and specialized workloads.
