Concept and Role of Persistent Volumes
In OpenShift, a PersistentVolume (PV) is a cluster-scoped resource that represents a piece of storage made available to the cluster. It is:
- Created and managed by cluster administrators (or by a storage provisioner/Operator), not by application developers.
- Independent from any particular Pod, Deployment, or Namespace.
- A bridge between underlying storage systems (NFS, iSCSI, cloud disks, CSI drivers, etc.) and Kubernetes/OpenShift workloads.
Conceptually, a PV is to storage what a Node is to compute: a resource that can be consumed but is not owned by a specific application.
Key properties of PVs:
- Cluster-wide object (not namespaced).
- Lifecycle separate from individual Pods.
- Backed by some physical or virtual storage.
- Bound later to a PersistentVolumeClaim (PVC) created by users.
PersistentVolume Object Structure
A PersistentVolume is defined as a Kubernetes resource with apiVersion, kind, metadata, spec, and status. The most important fields in spec are:
- capacity
- accessModes
- volumeMode
- storageClassName
- persistentVolumeReclaimPolicy
- mountOptions
- Backend-specific configuration (e.g., nfs, hostPath, awsElasticBlockStore, csi, etc.)
Example (static NFS-backed PV):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-example
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs-manual
  nfs:
    server: nfs-server.example.com
    path: /exports/data

This illustrates that:
- The PV advertises 20Gi of writable storage.
- Multiple Pods can mount it (ReadWriteMany).
- It uses an NFS server as the underlying storage.
- It is associated with a specific storageClassName so that PVCs can target it.
Access Modes
Access modes describe how a PV can be mounted by Pods. Common modes you’ll see in OpenShift:
- ReadWriteOnce (RWO): mounted read-write by a single node at a time (typical for block volumes like cloud disks).
- ReadOnlyMany (ROX): mounted read-only by many nodes (useful for shared reference data).
- ReadWriteMany (RWX): mounted read-write by many nodes simultaneously (common with NFS, some CSI drivers, or shared file systems).
Which modes are available depends on the storage backend and driver. For example, many cloud block volumes only support ReadWriteOnce, whereas shared file systems or file-based CSI drivers may support ReadWriteMany.
Volume Modes: Filesystem vs Block
PVs can expose storage to Pods in two main modes:
- volumeMode: Filesystem (default)
  - The volume is formatted with a filesystem and mounted into the Pod.
  - Pods see a normal directory tree; most typical workloads use this.
- volumeMode: Block
  - The volume is presented as a raw block device.
  - Advanced workloads (databases, some HPC or storage engines) can manage their own filesystem or data layout on top.
Example snippet:
volumeMode: Block
accessModes:
  - ReadWriteOnce
capacity:
  storage: 100Gi
In OpenShift, not every storage backend supports Block mode. You must ensure that the underlying storage and CSI driver support it.
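To illustrate how a Pod consumes such a volume, here is a minimal sketch; the Pod, image, and claim names are hypothetical, and the claim is assumed to have volumeMode: Block. Note that the container uses volumeDevices with a devicePath instead of a volumeMounts entry:

apiVersion: v1
kind: Pod
metadata:
  name: block-consumer                        # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/db:latest   # hypothetical image
      volumeDevices:                          # raw device, not a filesystem mount
        - name: data
          devicePath: /dev/xvda               # device node exposed inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: block-pvc                  # a PVC requesting volumeMode: Block

The application is then responsible for formatting or otherwise managing the raw device at /dev/xvda.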
Static vs Dynamic PV Provisioning
OpenShift clusters can expose PVs using two patterns:
Static Provisioning
With static provisioning, administrators pre-create PersistentVolumes that refer to existing storage.
Characteristics:
- Admin manually allocates and configures storage on the backend (e.g., creates an NFS export, a LUN, etc.).
- Admin creates a PV object pointing to that storage.
- Users create PVCs that match those PVs (by size, access mode, storageClassName, etc.).
- PVs are finite and explicitly managed; you see the specific PV names and their backing details.
Static provisioning is useful when:
- Storage comes from legacy systems or non-automated infrastructure.
- You need very specific or curated allocations (e.g., shared project storage, pre-populated datasets).
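For example, a claim intended to match the NFS-backed PV defined earlier might look like the following sketch (the claim name and namespace are illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data               # illustrative name
  namespace: my-project           # illustrative namespace
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  storageClassName: nfs-manual    # matches the PV's storageClassName
  resources:
    requests:
      storage: 20Gi               # must not exceed the PV's 20Gi capacity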
Dynamic Provisioning (via StorageClasses)
Dynamic provisioning is usually preferred in OpenShift:
- Users create PVCs requesting a size and a storageClassName.
- A provisioner (often a CSI driver or built-in plugin) automatically creates the underlying storage and the PV.
- Users do not need to know backend details like NFS paths or device names.
In dynamic provisioning:
- PVs are created on demand.
- PVs usually have opaque auto-generated names.
- The lifecycle of the storage is tightly coupled to the PVC and its reclaim policy.
From the PV perspective, dynamic provisioning means:
- The PV is created automatically after a PVC appears.
- The PV’s spec is generated based on the StorageClass parameters.
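A minimal sketch of the dynamic flow follows. The StorageClass and PVC names are illustrative, and the provisioner shown (ebs.csi.aws.com, the AWS EBS CSI driver) is just one example; the provisioners available depend on your cluster:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-block               # illustrative name
provisioner: ebs.csi.aws.com         # example CSI driver; varies by cluster
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                     # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard-block
  resources:
    requests:
      storage: 50Gi                  # the provisioner creates a PV of this size

With WaitForFirstConsumer, provisioning is deferred until a Pod using app-data is scheduled; the driver then creates the volume and a matching PV appears automatically.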
Binding: Matching PVs and PVCs
PersistentVolumes are not used directly by Pods. Instead, Pods use PVCs, which the cluster matches to PVs. The binding is based on:
- storageClassName: must match between PV and PVC (or both left empty).
- accessModes: the PV must support all requested modes.
- capacity: the PV capacity must be >= the PVC's requested size.
- volumeMode: must match.
- Any additional selector constraints on the PVC (labels, etc.).
Binding behavior:
- A PV can be bound to exactly one PVC at a time.
- Once bound, the PV's claimRef points to the PVC.
- The binding is typically one-way: a PV is not re-bound to another PVC unless it is first released and becomes Available again.
Administrators should ensure there is a reasonable pool of pre-created PVs or a dynamic provisioning mechanism so that legitimate PVCs find matching volumes.
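When a PVC must bind to one specific PV rather than any matching one, the claim can name the volume explicitly via spec.volumeName. A sketch, reusing the earlier example PV:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pinned-claim              # illustrative name
spec:
  volumeName: pv-nfs-example      # bind to this specific PV only
  storageClassName: nfs-manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi

The other matching rules (class, modes, capacity) still apply; volumeName simply restricts the candidate set to one PV.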
Reclaim Policies
The persistentVolumeReclaimPolicy determines what happens to the underlying storage when the PVC is deleted. Common policies:
Delete
- When the PVC is deleted, the PV and the underlying storage asset are removed.
- Typical for dynamically provisioned cloud volumes or CSI-managed disks.
- Simpler lifecycle: storage is cleaned up automatically.

Retain
- When the PVC is deleted, the PV moves into the Released state.
- The underlying storage is not automatically deleted.
- Administrators must manually handle the reuse or deletion of the actual storage (and often the PV object).
- Useful to prevent accidental data loss or for data archival.

Recycle (legacy)
- Previously supported; volumes would be scrubbed and made available again.
- Deprecated and generally not used in modern OpenShift setups.
Choosing a policy:
- For user-managed or critical data where accidental deletion is risky, Retain is safer.
- For ephemeral or non-critical data, Delete provides better automation and fewer "orphaned" resources.
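The policy of an existing PV can be changed with a JSON patch; a sketch of the usual approach, reusing the PV name from the earlier example:

oc patch pv pv-nfs-example \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

This is a common step before deleting a PVC whose dynamically provisioned data must survive.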
Lifecycle States of a PersistentVolume
PVs move through several phases:
- Available: the PV is ready and has not yet been bound to a claim.
- Bound: the PV is attached to a specific PVC; only that PVC can use it.
- Released: the PVC has been deleted, but the PV's underlying storage may still contain data.
  - With Retain, this is a signal to administrators to clean up or reassign.
  - With Delete, the storage is often already removed, and the PV may disappear shortly afterward.
- Failed: the PV has encountered an error and is not usable in its current state; administrator action is required.
Understanding these phases helps operators track storage usage and cleanup needs in an OpenShift cluster.
Common PersistentVolume Backends in OpenShift
While the exact options depend on your environment and installed Operators/CSI drivers, typical PV backends include:
- NFS:
  - Simple and widely used shared filesystem.
  - Often supports ReadWriteMany.
  - Defined in PVs via an nfs: configuration stanza.
- Cloud provider volumes (e.g., AWS EBS, Azure Disk, GCE PD):
  - Usually block storage; often ReadWriteOnce.
  - Typically provisioned dynamically via StorageClasses and CSI drivers.
- Shared filesystems and appliances (e.g., CephFS, legacy GlusterFS, vendor NAS systems):
  - Often provide ReadWriteMany and large capacity.
  - Common in enterprise OpenShift deployments via CSI Operators.
- Local volumes:
  - Disk or directory on a specific node (see the example after this list).
  - Can offer high performance but are node-bound; failure or eviction of that node can impact availability.
- Specialized CSI drivers:
  - For backup appliances, object gateways (via translation layers), or HPC file systems.
From the PV perspective, each backend is configured in a backend-specific stanza (nfs, csi, etc.), but all appear as PersistentVolume objects to users and workloads.
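As an illustration of the node-bound nature of local volumes mentioned above, a minimal local PV sketch follows; the path, capacity, and node name are hypothetical. Unlike other backends, a local PV requires a nodeAffinity stanza pinning it to the node that holds the disk:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-local-example            # illustrative name
spec:
  capacity:
    storage: 500Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage   # illustrative class name
  local:
    path: /mnt/disks/ssd1           # device or directory on the node
  nodeAffinity:                     # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1          # hypothetical node name

Pods bound to this PV can only be scheduled onto worker-1, which is exactly the availability trade-off noted above.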
Security and Access Control Considerations
While detailed security topics are covered elsewhere, a few PV-specific concerns matter in OpenShift:
- File permissions and ownership:
  - PVs often preserve permissions and ownership across Pods.
  - Some storage providers integrate with OpenShift's Security Context Constraints (SCCs) to adjust ownership automatically (see the sketch after this section).
  - Misconfigured permissions can prevent containers from writing to the volume.
- Shared volumes:
  - RWX volumes mean multiple Pods (often from different Deployments) can write simultaneously.
  - Without additional coordination at the application layer, this can lead to data corruption or accidental overwrites.
- Network-attached storage:
  - When using NFS or similar, network and firewall configuration affects availability and performance.
  - Export-level permissions (e.g., allowed clients, root squash) must align with cluster needs.
Administrators should verify that PV configuration matches both security policies and workload requirements.
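As one concrete permissions mechanism, Kubernetes can adjust group ownership of supported volumes via the Pod's securityContext.fsGroup. A minimal sketch follows; the GID and names are illustrative, and under OpenShift's restricted SCC the value is typically assigned from the project's UID/GID range rather than set by hand:

apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-example                         # illustrative name
spec:
  securityContext:
    fsGroup: 1000770000                         # illustrative GID; volume files become group-owned by it
  containers:
    - name: app
      image: registry.example.com/app:latest    # illustrative image
      volumeMounts:
        - name: data
          mountPath: /var/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: shared-data                  # the claim from the earlier static example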
Operational Practices for Persistent Volumes
From an OpenShift operations perspective, some recurring tasks around PVs include:
- Capacity planning
  - Monitoring how many PVs are allocated and how much space they represent on backend systems.
  - Ensuring StorageClasses and provisioners have enough capacity to serve new PVCs.
- Cleaning up Released volumes
  - For the Retain policy, periodically identifying Released PVs and confirming whether the backing storage should be:
    - Reclaimed (securely wiped and reused).
    - Archived.
    - Permanently deleted.
- Standardizing with StorageClasses
  - Using StorageClasses with clear naming (e.g., fast, standard, archive) so PVs and their backing capabilities (performance, redundancy) are predictable.
  - Ensuring dynamically provisioned PVs inherit appropriate reclaim policies and parameters.
- Troubleshooting binding issues
  - When PVCs remain Pending, verifying:
    - That PVs with a matching storageClassName and sufficient capacity exist or can be dynamically created.
    - That access modes and volume modes align.
    - That any label selectors are correct.
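The commands below are a typical starting point for that verification (namespace and claim names are placeholders):

oc get pvc -n <namespace>                     # find claims stuck in Pending
oc describe pvc <claim-name> -n <namespace>   # events usually state why binding failed
oc get pv                                     # check phases, classes, capacities, access modes
oc get storageclass                           # confirm the requested class exists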
By treating PVs as managed infrastructure resources rather than ad hoc directories or disks, OpenShift provides a predictable and auditable storage layer for stateful applications.
Summary
PersistentVolumes in OpenShift:
- Represent cluster-level storage resources, independent of Pods and Namespaces.
- Provide a unified abstraction over many different storage backends.
- Are matched to user-facing PersistentVolumeClaims through access modes, capacity, volume mode, and storage classes.
- Have reclaim policies that control the fate of the underlying storage after PVC deletion.
- Require thoughtful operational management for capacity, security, and lifecycle.
Subsequent chapters will focus on how applications request these PVs using PersistentVolumeClaims and how StorageClasses and dynamic provisioning simplify day-to-day usage.