How OpenShift Thinks About Storage
OpenShift builds on Kubernetes’ storage primitives but adds opinionated defaults, integrations, and automation to make storage usable and manageable at cluster scale. The storage model is about how OpenShift presents, provisions, and attaches storage to applications running in pods.
At a high level, OpenShift’s storage model is:
- Kubernetes-native: Uses PersistentVolumes (PVs), PersistentVolumeClaims (PVCs), StorageClasses, etc.
- Pluggable: Supports many backends (cloud block/file, on‑prem SAN/NAS, software-defined storage).
- Dynamic: Favors dynamic provisioning over static, manual volume creation.
- Policy-driven: Uses StorageClasses and access modes to control behavior and guarantees.
- Tenant-aware: Plays nicely with OpenShift’s multi-tenant and security model.
This chapter focuses on how all of these pieces hang together conceptually in OpenShift, not the low-level details of each resource (covered in storage-focused chapters).
Storage Layers in OpenShift
Conceptually, OpenShift’s storage model has several layers:
- Application layer (pods, Deployments, StatefulSets)
- Kubernetes storage abstractions (PVCs, PVs, StorageClasses, volume mounts)
- OpenShift storage operators and integrations
- Underlying storage backends (cloud, on-prem, software-defined)
You typically interact with layers 1–3. Layer 4 is implemented by infrastructure/service teams or cloud providers.
1. Application Layer
Applications in OpenShift never talk directly to storage backends. They see:
- Volumes mounted into containers at paths like `/data`, `/var/lib/app`, etc.
- Claims (`PersistentVolumeClaim` objects) in their pod specs that request storage.
A pod spec refers to a PVC by name. The PVC abstracts away where the data lives and how it is provisioned.
Key consequences:
- The same deployment manifest can run on different clusters with different storage backends, as long as a compatible StorageClass exists.
- Application teams think in terms of “I need X Gi of storage with these properties”, not in terms of LUNs, NFS exports, or disks.
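To make the application-layer view concrete, here is a minimal sketch of a pod spec referring to a PVC by name. The image, claim name `app-data`, and mount path are illustrative placeholders, not OpenShift defaults:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest  # hypothetical image
      volumeMounts:
        - name: data
          mountPath: /data        # the path the application sees
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data       # the PVC hides where the data actually lives
```

The pod only names the claim; the same manifest runs unchanged on any cluster where a compatible claim can be satisfied.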
2. Kubernetes Storage Abstractions as Used by OpenShift
OpenShift fully uses and supports the standard Kubernetes storage API objects, but with some specific behaviors and conventions.
PersistentVolumes (PVs) in the OpenShift Model
- PVs represent actual storage resources: a slice of a SAN LUN, a cloud disk, an NFS share, an object from software-defined storage, etc.
- Administrators can:
- Create PVs manually (static provisioning)
- Let them be created automatically (dynamic provisioning via StorageClasses)
In practice, modern OpenShift deployments rely heavily on dynamic PV provisioning. Static PVs are used for integration with legacy or special-purpose storage.
PersistentVolumeClaims (PVCs) as the Main User Interface
- PVCs are how applications request storage.
- A PVC states:
- Requested capacity: e.g., `10Gi`
- Access mode(s): `ReadWriteOnce` (RWO), `ReadWriteMany` (RWX), `ReadOnlyMany` (ROX)
- StorageClass: which policy/backend to use (or none for default behavior)
In OpenShift:
- Workloads in projects (namespaces) create PVCs.
- The cluster tries to bind each PVC to a matching PV.
- If a matching PV does not exist and a StorageClass with dynamic provisioning is referenced, OpenShift’s provisioner creates a new PV automatically.
The PVC/PV binding happens once per claim; the bound PV is then used by one or more pods according to its capabilities and access mode.
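A minimal PVC expressing these three pieces of information might look like the following sketch; the name `app-data` and class name `fast` are hypothetical:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce          # RWO: read-write from a single node at a time
  resources:
    requests:
      storage: 10Gi          # requested capacity
  storageClassName: fast     # hypothetical class; omit to use the cluster default
```

Once this claim binds (to a pre-existing PV or a dynamically provisioned one), any pod in the same project can mount it by name.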
StorageClasses as Policy and Backend Encapsulation
StorageClasses define:
- Which backend to use (e.g., AWS EBS, VMware vSphere, OpenShift Data Foundation, NFS, etc.).
- Provisioner plugin to call (CSI driver name).
- Default parameters:
- Performance tier (SSD vs HDD)
- Replication policy
- Filesystem type (if applicable)
- Thin/thick provisioning
- Reclaim policy: what happens to PVs after a PVC is deleted (`Delete` vs `Retain` vs `Recycle`, where supported).
- Volume binding mode: when the underlying volume is actually created (immediately vs delayed until scheduled).
In OpenShift:
- Administrators set up one or more cluster-wide StorageClasses.
- One StorageClass is usually annotated as the default, so PVCs that omit `storageClassName` still get dynamically provisioned.
- Application teams typically only need to know:
- The name(s) of StorageClasses they are allowed/recommended to use.
- Which StorageClass fits their performance and access requirements.
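Putting these pieces together, a StorageClass might be sketched as below. The class name, provisioner, and `tier` parameter are hypothetical (real parameter keys are defined by each CSI driver); the default-class annotation is the standard Kubernetes one:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"  # marks the cluster default
provisioner: csi.example.vendor.com   # hypothetical CSI driver name
parameters:
  tier: ssd                 # backend-specific; keys vary per driver
reclaimPolicy: Delete       # remove the backend volume when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer
```

Everything below `provisioner` is policy that application teams never see; they only reference the class by name.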
Storage Backends in OpenShift
OpenShift is designed to run in many environments, so its storage model must remain agnostic to any specific storage technology. It achieves this with a plugin architecture, now standardized on the Container Storage Interface (CSI).
Cloud-Provider Storage
When OpenShift runs on a public cloud (e.g., AWS, Azure, GCP), there are typically built-in StorageClasses for:
- Block storage (EBS, Azure Disk, Persistent Disk)
- File storage (EFS, Azure Files, Filestore) for RWX scenarios, often via CSI drivers
Characteristics in this context:
- Volumes are usually zoned: tied to a specific availability zone; pods using them must schedule there.
- Performance and cost characteristics are selected via StorageClass parameters (e.g., `gp3` vs `io1`).
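As a concrete cloud example, a class for AWS EBS `gp3` volumes might look like this sketch. The provisioner `ebs.csi.aws.com` and the `type` parameter are real AWS EBS CSI driver settings; the class name is illustrative:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-csi
provisioner: ebs.csi.aws.com   # AWS EBS CSI driver
parameters:
  type: gp3                    # EBS volume type selects the cost/performance tier
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer  # provision in the zone where the pod lands
```

The delayed binding mode matters here because EBS volumes are zonal: provisioning waits until the scheduler has picked a node, so the disk is created in the right availability zone.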
OpenShift integrates tightly with cloud provider CSI drivers to let PVCs automatically create cloud volumes and attach/detach them to nodes.
On-Prem and Virtualized Environments
In on-premise or virtualized deployments (bare metal, VMware, etc.), OpenShift relies on:
- SAN/NAS systems exposed via CSI drivers.
- NFS or SMB as file-based storage for RWX when supported.
- OpenShift Data Foundation (a Red Hat software-defined storage solution) as a common integrated option.
StorageClasses in these environments encapsulate details such as pool IDs, replication factor, and storage arrays.
Software-Defined Storage and OpenShift Data Foundation (ODF)
In many OpenShift deployments, software-defined storage is used to provide:
- Cluster-wide storage across nodes.
- Integrated management, encryption, multi-tenancy, and metrics.
OpenShift Data Foundation (ODF) is Red Hat’s integrated option that:
- Runs as Operators inside the cluster.
- Exposes RWO and RWX StorageClasses.
- Handles replication, failure domains, and scaling behind the scenes.
From the cluster user perspective, ODF is just another set of StorageClasses with specific capabilities.
Volume Types and Access Patterns
OpenShift’s storage model must support different workloads with different patterns of access, consistency, and performance requirements.
Ephemeral vs Persistent Storage
OpenShift distinguishes between:
- Ephemeral volumes:
- Tied to the lifecycle of a pod.
- Data is lost when the pod is deleted or rescheduled.
- Examples: `emptyDir`, `configMap`, `secret`, CSI ephemeral volumes.
- Useful for caches, temporary working data, and stateless workloads.
- Persistent volumes:
- Outlive individual pods.
- Bound via PVCs/PVs.
- Required for databases, file repositories, message queues, etc.
The storage model encourages:
- Stateless by default: Use ephemeral storage when you can.
- Persistent where necessary: Introduce PV-backed storage when application semantics require it.
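The "stateless by default" preference is cheap to express: an `emptyDir` volume needs no PVC at all. A minimal sketch (image and paths are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-worker
spec:
  containers:
    - name: worker
      image: registry.example.com/worker:latest  # hypothetical image
      volumeMounts:
        - name: scratch
          mountPath: /tmp/cache
  volumes:
    - name: scratch
      emptyDir: {}             # scratch space that lives and dies with the pod
```

When the pod is deleted or rescheduled, the `emptyDir` contents are gone; anything that must survive belongs in a PVC-backed volume instead.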
Access Modes and Their Implications
Common access modes in OpenShift:
- `ReadWriteOnce` (RWO): Mounted as read-write by a single node at a time.
- `ReadWriteMany` (RWX): Mounted as read-write by many nodes simultaneously.
- `ReadOnlyMany` (ROX): Mounted as read-only by many nodes.
Implications in practice:
- RWO volumes are typical for:
- Single-instance databases.
- Stateful services not designed for multi-writer concurrency.
- RWX volumes are typical for:
- Shared file repositories.
- Some legacy apps expecting shared network file systems.
- Multi-replica workloads sharing common writable state (if the application can handle it).
In OpenShift, whether a particular StorageClass supports RWO, RWX, or ROX depends on the underlying backend and the CSI driver’s capabilities.
Dynamic Provisioning and Lifecycle
Dynamic provisioning is central to OpenShift’s storage model: storage is allocated on demand when applications need it.
Binding Workflow
Typical lifecycle:
- An application creates a PVC (manually or via a higher-level resource).
- Kubernetes/OpenShift checks if there is:
- A matching pre-existing PV (static provisioning), or
- A StorageClass that can provision a new PV.
- If using a StorageClass:
- The CSI provisioner is invoked to create a new backend volume.
- A PV object is created and bound to the PVC.
- A pod using that PVC is scheduled:
- The volume is attached to the node (if needed).
- The volume is mounted into the container at the requested path.
Deletion lifecycle:
- When a PVC is deleted, the PV’s reclaim policy determines what happens:
- `Delete`: The underlying storage (e.g., cloud disk, volume on SDS) is removed.
- `Retain`: The PV remains and the data stays intact; an admin must manually handle it.
- (Older `Recycle` behavior is generally deprecated and not used in modern setups.)
In OpenShift, default StorageClasses usually use `Delete`, but storage administrators might choose different policies for compliance or data retention.
Delayed Binding and Topology Awareness
In multi-zone/multi-region setups, it is important that storage:
- Resides in the same zone as the pods using it.
- Takes topology constraints into account.
StorageClasses can specify a volume binding mode:
- `Immediate`: Volume is provisioned as soon as the PVC is created.
- `WaitForFirstConsumer`: Volume is provisioned only when a pod using the PVC is scheduled, so the provisioner knows where (zone/region) to create it.
OpenShift clusters often use `WaitForFirstConsumer` for block storage on public clouds to avoid cross-zone issues.
Multi-Tenancy, Security, and Isolation
OpenShift’s security and multi-tenancy model influences how storage is provided and used.
Namespaces and Storage Visibility
Key points:
- PVCs are namespaced; PVs are cluster-scoped.
- A PV can only be bound to a PVC in one namespace at a time.
- Multiple pods in the same namespace can share a RWX volume by mounting the same PVC; because a PV binds to exactly one PVC, the same volume cannot be shared across namespaces through separate claims.
This model allows:
- Isolation of applications via namespaces.
- Central management of storage (PVs, StorageClasses) by cluster admins.
Security Context Constraints (SCCs) and Filesystem Access
OpenShift uses SCCs to:
- Restrict pod privileges.
- Control user and group IDs under which containers run.
- Guard against unsafe hostPath mounts and privileged access.
Effects on storage:
- Filesystem permissions and ownership on PVs must align with the UID/GID strategy enforced by SCCs.
- Some storage setups require:
- `fsGroup` configurations in pod specs.
- Supplemental groups for shared volumes.
- Pre-initialization or customization of the volume’s permissions.
Cluster admins and storage admins need to ensure that StorageClasses and storage backends work with OpenShift’s security defaults (e.g., non-root containers).
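A sketch of how `fsGroup` and supplemental groups appear in a pod spec is below. The group IDs and claim name are hypothetical; note that under OpenShift's restricted SCCs, explicit IDs may be rejected unless they fall within the project's allowed ranges (by default, OpenShift assigns them for you):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-writer
spec:
  securityContext:
    fsGroup: 5555               # group applied to volume files on supported volumes
    supplementalGroups: [5556]  # extra groups, e.g. to match pre-existing NFS data
  containers:
    - name: app
      image: registry.example.com/app:latest  # hypothetical image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: shared-data  # hypothetical PVC
```

The practical takeaway: volume ownership must line up with the UID/GID the SCC actually lets the container run as, whether those IDs are set explicitly or assigned by OpenShift.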
Operator-Driven Storage Management
OpenShift heavily uses Operators to manage platform components, and storage is no exception.
CSI Drivers as Operators
Many CSI drivers are installed and managed via:
- The Operator Lifecycle Manager (OLM).
- Vendor-provided Operators (e.g., cloud provider CSI Operators, storage vendor Operators).
These Operators:
- Deploy and configure the necessary CSI components:
- Controller pods for provisioning and attaching volumes.
- Node daemons for mounting and formatting operations.
- Create and manage StorageClasses appropriate for the backend.
- Manage versioning, upgrades, and health of the storage integration.
From the cluster user’s perspective, once the Operator is installed, they simply see additional StorageClasses and everything “just works.”
OpenShift Data Foundation Operator
When using ODF:
- An Operator (or set of Operators) deploys storage services across worker nodes.
- StorageClasses are created that:
- Point to the ODF-backed storage.
- Provide different access modes and performance profiles.
- ODF handles replication, failures, scaling, and data placement according to policies.
Administrators interact with the Operator for capacity expansion, upgrades, and monitoring. Application teams continue to work only with PVCs and StorageClasses.
Storage Model Considerations for Workload Types
While detailed stateful application patterns are covered elsewhere, the storage model provides mechanisms for several broad workload categories.
Stateless Applications
- Prefer ephemeral volumes (`emptyDir`, projected volumes).
- Keep any persistent state outside of the pods:
- External databases
- Object storage
- External services
OpenShift’s storage model supports this by making ephemeral options simple and default-friendly.
Stateful Applications
For workloads that require data persistence:
- Use PVC-backed volumes.
- Often rely on specific Kubernetes primitives (e.g., StatefulSets) that orchestrate:
- Stable identities for pods.
- Stable volume-to-pod mappings.
OpenShift’s storage model ensures that when a pod is rescheduled:
- The same PV is re-attached to a new node (subject to access mode and topology).
- Data persists as expected.
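The stable volume-to-pod mapping comes from `volumeClaimTemplates` in a StatefulSet: each replica gets its own PVC with a predictable name. A minimal sketch (names, image, and sizes are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: registry.example.com/db:latest  # hypothetical image
          volumeMounts:
            - name: data
              mountPath: /var/lib/db
  volumeClaimTemplates:       # one PVC per replica: data-db-0, data-db-1, data-db-2
    - metadata:
        name: data
      spec:
        accessModes: [ReadWriteOnce]
        resources:
          requests:
            storage: 20Gi
```

If `db-1` is rescheduled to another node, it reattaches to the same `data-db-1` claim, so its data follows it.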
Shared-Data Applications
Some workloads need shared writable storage:
- Multiple replicas writing to an RWX volume.
- Legacy applications expecting a shared NFS-like directory.
OpenShift’s storage model supports these via:
- RWX-capable StorageClasses and CSI drivers.
- Possibly dedicated file or object-storage integrations.
Admins must choose storage backends that implement correct semantics for concurrent access to avoid data corruption.
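Assuming an RWX-capable StorageClass exists, shared writable storage is just several replicas mounting one claim. A sketch, with hypothetical names and the assumption that `shared-data` was created with `accessModes: [ReadWriteMany]`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shared-content
spec:
  replicas: 3                  # all replicas mount the same volume read-write
  selector:
    matchLabels:
      app: shared-content
  template:
    metadata:
      labels:
        app: shared-content
    spec:
      containers:
        - name: app
          image: registry.example.com/app:latest  # hypothetical image
          volumeMounts:
            - name: shared
              mountPath: /srv/shared
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: shared-data  # PVC requesting ReadWriteMany
```

The manifest is trivial; the hard part is the backend and the application itself, both of which must handle concurrent writers correctly.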
Observability and Operations for Storage
Operators and administrators need to monitor and operate storage at scale. While detailed observability topics are covered elsewhere, the storage model includes some operational expectations.
Capacity and Quotas
OpenShift integrates with:
- Resource quotas at the namespace level, which can include:
- Limits on total requested storage across PVCs.
- Cluster-level storage capacity reporting:
- CSI drivers can advertise available capacity.
- ODF and other storage systems expose metrics and dashboards.
The model encourages:
- Avoiding “unbounded PVC growth” via quotas and policies.
- Ensuring that StorageClasses represent realistic capacity and performance.
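Namespace-level storage quotas can be sketched as follows; the namespace, limits, and class name `fast` are illustrative, while the quota keys themselves are standard Kubernetes ones:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-a            # hypothetical project
spec:
  hard:
    requests.storage: 100Gi           # total requested storage across all PVCs
    persistentvolumeclaims: "10"      # maximum number of PVCs in the project
    fast.storageclass.storage.k8s.io/requests.storage: 50Gi  # cap per StorageClass
```

The per-class key lets admins, for example, cap how much of an expensive SSD-backed class a single team can consume while leaving cheaper classes more open.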
Backup, Restore, and Migration
The storage model itself doesn’t define backup/restore, but influences how such tools work:
- PVCs and PVs are the units of data to back up or snapshot.
- CSI drivers may provide:
- Snapshots of volumes.
- Clone operations between volumes.
- Higher-level tools (Operators or external systems) orchestrate:
- Application-consistent backups.
- Restore and migration across clusters or environments.
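Where the CSI driver supports snapshots, a point-in-time copy of a PVC is requested declaratively. A sketch, assuming a hypothetical snapshot class `csi-snapclass` and a PVC named `app-data`:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: app-data-snap
spec:
  volumeSnapshotClassName: csi-snapclass  # hypothetical VolumeSnapshotClass
  source:
    persistentVolumeClaimName: app-data   # the PVC to snapshot
```

Backup tools build on exactly this primitive (plus volume cloning and restore-from-snapshot), which is why consistent PVC/PV semantics matter so much to them.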
OpenShift’s role is to provide consistent Kubernetes storage semantics so such tools can operate predictably.
Summary of the OpenShift Storage Model
- OpenShift uses standard Kubernetes storage concepts but integrates them deeply with:
- CSI-based drivers
- Operators for storage backends
- Multi-tenant security and policy
- Applications request storage via PVCs, not by talking directly to storage systems.
- StorageClasses encapsulate storage backends and policies, enabling dynamic provisioning.
- The model is flexible enough to support:
- Public cloud, on-prem, and hybrid environments.
- Ephemeral, persistent, and shared storage patterns.
- Operators and CSI drivers are central to provisioning, attaching, and managing volumes in a way that is consistent with OpenShift’s overall architecture.
Subsequent storage-focused chapters zoom into individual components such as PersistentVolumes, PersistentVolumeClaims, StorageClasses, and how to design storage for stateful and specialized workloads.