Why Common Platform Operators Matter
In OpenShift, many “day 2” operations (installing, configuring, and keeping core services updated) are handled by Red Hat–provided Operators. These are not just add‑ons: they form the backbone of the platform’s functionality and user experience.
This chapter focuses on the most commonly encountered platform Operators you’ll see in a typical OpenShift cluster, what they are responsible for, and how they affect application teams and cluster administrators.
You do not need to memorize every Operator name; instead, focus on:
- What area of the platform each Operator family covers.
- What kind of problems it solves or automates.
- Typical interactions and high‑level workflows.
Cluster Version and Cluster Infrastructure Operators
These Operators are responsible for the overall health and lifecycle of the cluster itself.
Cluster Version Operator (CVO)
The CVO is the “conductor” for OpenShift cluster upgrades and component versions.
Key responsibilities:
- Manages the cluster’s desired version (e.g., `4.16.15`).
- Continuously reconciles the cluster to match the release image you selected.
- Ensures all cluster Operators that are part of the core platform match the expected versions and manifests.
Practical implications:
- Cluster admins upgrade OpenShift by adjusting the “desired” version; the CVO orchestrates the rest.
- If a core component fails to upgrade or drifts from its expected configuration, CVO reports the issue and attempts remediation.
- The CVO works through the individual cluster Operators to converge the platform to the desired state; optional add-on Operators are handled separately by the Operator Lifecycle Manager.
Typical interactions:
- Viewing status with `oc` (for illustration; detailed CLI usage is covered elsewhere):

```shell
oc get clusterversion
oc describe clusterversion version
```

- Using the web console’s “Cluster Settings” page to view and initiate upgrades.
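For illustration, the “desired” version lives in the cluster-scoped `ClusterVersion` resource that the CVO reconciles; a sketch (the channel and version values are hypothetical):

```yaml
# Sketch of a ClusterVersion resource (values are illustrative).
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  name: version
spec:
  channel: stable-4.16      # upgrade channel the cluster follows
  desiredUpdate:
    version: 4.16.15        # the desired version the CVO reconciles toward
```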
Cluster Operators vs. Application Operators
Many of the “common platform Operators” you’ll see are Cluster Operators—they are part of the core platform and are managed by the CVO.
Characteristics of Cluster Operators:
- Names like `authentication`, `console`, `kube-apiserver`, `ingress`, `network`, etc.
- Report status conditions such as `Available`, `Progressing`, and `Degraded`.
- Cannot be uninstalled in a supported way; they are integral to the platform.
Application or add‑on Operators, by contrast:
- Are typically managed by OLM (Operator Lifecycle Manager).
- Can be installed/uninstalled as optional components (e.g., logging stack, service mesh, database Operators).
- Usually control a specific service or application, rather than core cluster behavior.
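Add-on Operators are typically installed by creating an OLM `Subscription`; a hedged sketch (the Operator name is hypothetical, the catalog source is a common Red Hat default):

```yaml
# Illustrative OLM Subscription for an optional add-on Operator.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: my-addon-operator          # hypothetical Operator name
  namespace: openshift-operators
spec:
  channel: stable                  # update channel published in the catalog
  name: my-addon-operator
  source: redhat-operators         # catalog source to install from
  sourceNamespace: openshift-marketplace
```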
Core Control Plane and Infrastructure Operators
OpenShift decomposes much of the Kubernetes control plane and supporting platform components into Operators. While you won’t often configure these directly, understanding what they cover helps you troubleshoot and understand cluster behavior.
API Server and Authentication Operators
Kubernetes API Server Operator
- Manages the configuration and deployment of the `kube-apiserver` pods that serve all Kubernetes API requests.
- Applies configuration defined in cluster-wide resources (for example, the cluster’s API endpoint configuration).
- Ensures correct certificate rotation and rolling restarts when configuration changes.
Typical visibility:
- You may see this Operator report `Progressing` during upgrades or configuration changes that affect the API server.
- If the API server is unhealthy or frequently restarting, this Operator often shows `Degraded` with details.
Authentication Operator
- Manages the cluster authentication stack:
  - OAuth server (for web console and `oc login`).
  - Identity provider (IdP) configuration (e.g., LDAP, GitHub, OIDC).
- Reconciles the cluster-wide `OAuth` configuration resource into actual deployments and routes.
For application teams:
- Determines how users log in and how tokens are issued.
- Changes to login mechanisms (e.g., adding SSO) typically go through this Operator via its configuration resources, not by editing deployments manually.
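As an example of that pattern, adding an identity provider means editing the cluster `OAuth` resource rather than any deployment; a sketch assuming an htpasswd-backed IdP (the IdP and Secret names are hypothetical):

```yaml
# Sketch: adding an htpasswd identity provider via the cluster OAuth resource.
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
    - name: local-users           # hypothetical IdP name shown at login
      mappingMethod: claim
      type: HTPasswd
      htpasswd:
        fileData:
          name: htpasswd-secret   # Secret holding the htpasswd file
```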
Ingress and Network Operators
Ingress Operator
- Manages `IngressController` resources that define how external HTTP/HTTPS traffic reaches cluster Services.
- Creates and maintains HAProxy-based router pods and related Kubernetes `Service`/`Route` objects.
- Handles certificate configuration for external routes, wildcard certificates, and load balancer integration where supported.
Common use cases:
- Configuring custom domain names and TLS certificates for routes.
- Adjusting the number of router replicas for availability and throughput.
- Managing multiple ingress controllers (e.g., internal vs. external traffic).
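These use cases map to fields on the `IngressController` resource; a sketch (the certificate Secret name is hypothetical):

```yaml
# Sketch: tuning the default IngressController (replica count, custom TLS cert).
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  replicas: 3                     # router pod count for availability/throughput
  defaultCertificate:
    name: custom-wildcard-cert    # hypothetical Secret with the wildcard cert
```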
Network Operator
- Manages the cluster’s networking implementation:
- SDN or CNI plugin (e.g., OpenShift SDN, OVN-Kubernetes).
- Pod network CIDR and service network CIDR.
- Cluster network policies and related components.
- Responds to changes in high‑level network configuration objects.
What this means practically:
- During installation and initial configuration, this Operator enforces the cluster’s chosen networking model.
- Networking feature toggles or migration steps (for supported scenarios) go through this Operator.
- Many cluster‑wide networking changes require coordinated updates, which this Operator orchestrates.
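The high-level network configuration this Operator responds to lives in a cluster-scoped `Network` resource; a sketch (the CIDRs are illustrative, and most of these fields are fixed after installation):

```yaml
# Sketch: cluster network configuration (largely immutable after install).
apiVersion: config.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  networkType: OVNKubernetes      # the CNI plugin in use
  clusterNetwork:
    - cidr: 10.128.0.0/14         # pod network CIDR (illustrative)
      hostPrefix: 23
  serviceNetwork:
    - 172.30.0.0/16               # service network CIDR (illustrative)
```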
Machine and Node Management Operators
These Operators are heavily used in environments where OpenShift manages infrastructure (such as on major public clouds or bare metal with Machine API).
Machine API Operator
- Provides the “Machine” abstraction, which represents cluster nodes at the infrastructure level.
- Integrates with underlying infrastructure providers (AWS, Azure, GCP, vSphere, bare metal, etc.) to:
- Create, update, and delete VMs or bare‑metal nodes.
- Implement machine sets that behave similarly to node auto‑scaling groups.
- Keeps the desired number and type of worker nodes in sync with `MachineSet` definitions.
Effects you’ll notice:
- Adding worker capacity is often done by adjusting `MachineSet` replicas, not by provisioning servers manually.
- Automated node replacement after failure is driven by this Operator.
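Scaling workers then amounts to changing a replica count; a sketch of the relevant part of a `MachineSet` (the name is hypothetical, as real names are usually generated at install time):

```yaml
# Sketch: the replica count on a MachineSet drives worker capacity.
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: mycluster-worker-us-east-1a   # hypothetical name
  namespace: openshift-machine-api
spec:
  replicas: 3                         # raise to add workers, lower to remove
```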
Machine Config Operator (MCO)
- Manages the operating system configuration of nodes using `MachineConfig` resources.
- Applies OS-level changes (kernel arguments, OS packages/features, configuration files) in a controlled rolling fashion.
- Ensures worker and master nodes are configured consistently with the desired state defined at the cluster level.
Practical impact:
- Instead of SSHing into nodes to configure system services or files, cluster admins define `MachineConfig` objects that the MCO applies.
- Changes that require a node reboot (such as some kernel changes) are orchestrated with controlled rollouts to avoid downtime.
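For instance, a kernel argument can be rolled out to every worker with a `MachineConfig` like the following sketch (the object name and argument are illustrative):

```yaml
# Sketch: adding a kernel argument to worker nodes via a MachineConfig.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-kargs              # hypothetical name
  labels:
    machineconfiguration.openshift.io/role: worker   # target the worker pool
spec:
  kernelArguments:
    - audit=1                        # illustrative kernel argument
```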
User-Facing Platform Services Operators
Some Operators directly impact how developers and platform users interact with the cluster.
Console Operator
- Manages the OpenShift web console deployment and its configuration.
- Handles:
- Console route and TLS configuration.
- Branding/customization (logos, links).
- Integration points such as links to external tools or monitoring dashboards.
As an application developer:
- The console Operator ensures your web console is available and up‑to‑date.
- Changes to the console’s appearance or enabled features typically involve editing console configuration resources, not the underlying deployments.
Cluster Storage and Registry Operators
Cluster Storage Operator
- Coordinates default storage components and ensures required storage Operators are deployed.
- On some platforms, this Operator manages or integrates with:
- Default storage classes.
- Cluster‑provided storage backends for general workloads.
- Acts as an umbrella for more specific storage providers that might be delivered as additional Operators.
In practice:
- Determines what `StorageClass` objects are available by default.
- Can expose configuration knobs (via custom resources) that affect how default storage is provisioned and consumed.
Cluster Image Registry Operator
- Manages the internal container image registry that OpenShift provides by default.
- Controls:
- Storage backend configuration for the registry (e.g., object storage, filesystem).
- Number of registry replicas.
- Route exposure for internal and sometimes external access.
Implications for developers:
- The internal registry is often the default location for build outputs (e.g., Source-to-Image builds).
- Configuration changes, such as switching to a different storage backend for the registry, are handled by adjusting the registry’s custom resource, not by patching deployments directly.
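For example, pointing the registry at a PVC-backed filesystem is done by editing its config resource; a sketch (the claim name is hypothetical):

```yaml
# Sketch: registry storage backend set via the image registry config resource.
apiVersion: imageregistry.operator.openshift.io/v1
kind: Config
metadata:
  name: cluster
spec:
  replicas: 2                   # registry pod count
  storage:
    pvc:
      claim: registry-storage   # hypothetical PersistentVolumeClaim
```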
Observability and Logging Operators
Several common Operators manage the cluster’s monitoring and logging subsystems. Some are cluster Operators; others may be add‑on Operators installed via OLM.
Monitoring Stack Operators
OpenShift’s built‑in monitoring is typically managed by a set of Operators that:
- Deploy and configure:
- Prometheus and Alertmanager for metrics and alerts.
- Thanos or related components for long‑term metrics storage (where applicable).
- Reconcile cluster monitoring configuration to:
- Control what namespaces are monitored.
- Set retention and resource usage limits for system monitoring.
- Provide metrics and alerts both for platform components and optionally user workloads.
Practical considerations:
- Platform metrics used by horizontal pod autoscaling, cluster health dashboards, and alerts are under the control of these Operators.
- Changes to monitoring scope or retention are typically expressed via their configuration custom resources, not by editing Prometheus deployments.
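On current OpenShift versions, cluster monitoring configuration is commonly expressed as a ConfigMap rather than a dedicated CRD; a hedged sketch setting metrics retention (exact keys vary by version):

```yaml
# Sketch: cluster monitoring configuration (keys vary by OpenShift version).
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 15d            # illustrative metrics retention window
```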
Logging Stack Operators
Depending on the OpenShift version and chosen stack, logging may be managed by Operators such as:
- An Operator that deploys log collectors (e.g., Fluentd/Vector).
- An Operator that manages the log storage and visualization stack (e.g., Elasticsearch or Loki, plus Kibana/Grafana).
Common responsibilities:
- Ensuring log collectors are present on the right nodes (via DaemonSets).
- Managing indices or log streams, retention policies, and resource settings for the logging backend.
- Providing central configuration for which logs are collected and where they are forwarded (e.g., external SIEM, cloud log service).
From a platform user’s perspective:
- Cluster logs and application logs become available in a centralized interface.
- You or the admin configure log routing via CRDs exposed by the logging Operator, rather than directly configuring the collectors.
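A typical example is log forwarding declared through a CRD such as `ClusterLogForwarder`; a sketch (the external endpoint is hypothetical, and the exact API differs between logging stack versions):

```yaml
# Sketch: forwarding application logs to an external store via a logging CRD.
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: external-store
      type: elasticsearch
      url: https://logs.example.com:9200   # hypothetical external endpoint
  pipelines:
    - name: forward-app-logs
      inputRefs:
        - application                      # collect application logs
      outputRefs:
        - external-store
```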
Storage and Data Services Operators
Beyond the core storage Operator, many clusters deploy additional data‑related Operators to deliver persistent storage and database‑like services.
OpenShift Data Foundation (ODF) / Similar Storage Operators
These Operators:
- Provide a software‑defined storage solution (block, file, and object) tightly integrated with OpenShift.
- Expose storage functionality via custom resources:
- Defining storage clusters or data pools.
- Managing replication, failure domains, and capacity expansion.
- Automatically configure storage classes that application developers can reference via Persistent Volume Claims.
Usage patterns:
- Admins define the desired storage topology and capacity.
- Developers simply request persistent volumes via PVCs, with no need to understand underlying disks, RAIDs, or replication.
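From the developer side, that request is an ordinary PersistentVolumeClaim; a sketch (the storage class name is illustrative of an ODF-provided class):

```yaml
# Sketch: a developer-facing PVC against an Operator-provided storage class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ocs-storagecluster-ceph-rbd   # illustrative class name
```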
Database and Messaging Operators (Examples)
Common platform deployments often include Operators for:
- Databases: PostgreSQL, MySQL/MariaDB, MongoDB, etc.
- Messaging systems: Kafka, AMQP brokers.
While these are not core “must‑have” components like the API server or network Operators, they are very common in real clusters and follow similar patterns:
- Expose CRDs to define database instances, clusters, topics, users, and security settings.
- Automate:
- Deployment and scaling.
- Backup and restore primitives.
- Upgrades and failover for stateful services.
These Operators provide standardized, Kubernetes‑native APIs for data services that many applications depend on.
Application Platform Add‑on Operators
There is a set of Operators that extend OpenShift into a richer application platform beyond just “vanilla” Kubernetes.
Service Mesh Operator
- Deploys and configures an Istio‑based or similar service mesh stack.
- Manages custom resources that define:
- Service mesh control planes.
- Member namespaces.
- Traffic management policies (e.g., mTLS, retries, routing rules).
Impact on applications:
- Enables advanced traffic control, observability, and security for microservices, using mesh‑specific CRDs instead of manual sidecar management.
Serverless / Knative Operator
- Manages Knative components to enable serverless capabilities on OpenShift:
- Scale‑to‑zero workloads.
- Event‑driven functions.
- Provides CRDs for functions/services that can autoscale based on request load.
Effect for developers:
- Allows creating “serverless” applications through standard Kubernetes APIs extended by Knative CRDs.
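A minimal Knative `Service` illustrates the developer-facing shape of these CRDs (the container image is hypothetical):

```yaml
# Sketch: a Knative Service that scales to zero when idle.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
        - image: quay.io/example/hello:latest   # hypothetical image
```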
Pipelines (Tekton) Operator
- Deploys Tekton components as the foundation for OpenShift Pipelines.
- Manages pipeline‑related CRDs (Pipelines, Tasks, PipelineRuns, TaskRuns, etc.).
- Keeps the CI/CD pipeline engine updated and integrated with the cluster.
From a workflow perspective:
- Application developers design CI/CD workflows using Kubernetes‑style resources that this Operator interprets and executes.
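As an illustration of those Kubernetes-style resources, a minimal Tekton `Task` sketch (the base image is an assumption):

```yaml
# Sketch: a minimal Tekton Task executed by the Pipelines engine.
apiVersion: tekton.dev/v1
kind: Task
metadata:
  name: say-hello
spec:
  steps:
    - name: greet
      image: registry.access.redhat.com/ubi9/ubi-minimal   # assumed base image
      script: |
        echo "hello from Tekton"
```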
Typical Operational Interactions with Platform Operators
Even as a beginner, you’ll likely encounter platform Operators in a few recurring scenarios.
Checking Cluster Health via Cluster Operators
Platform health is often assessed by listing Cluster Operators:
```shell
oc get clusteroperators
```

You might see columns such as `AVAILABLE`, `PROGRESSING`, and `DEGRADED`.
Cluster admins use these to quickly identify which area of the platform is experiencing issues:
- `authentication` with `Available=False, Degraded=True` → trouble with logins or identity providers.
- `ingress` with `Available=False, Progressing=True` → routers being updated, or configuration rolling out.
- `machine-config` with `Available=False, Degraded=True` → node configuration changes failing.
Changing Platform Behavior Through CRDs
Each Operator usually exposes its own set of custom resources that you or admins can edit:
- Ingress configuration via `IngressController` resources.
- Registry configuration via `configs.imageregistry.operator.openshift.io`.
- Monitoring configuration via the cluster monitoring configuration resource (a ConfigMap on current versions).
- Machine and node configuration via `MachineConfig` and `MachineConfigPool`.
The common pattern:
- You adjust a high‑level configuration object.
- The corresponding Operator reconciles the actual deployments, services, or nodes.
- You confirm the change by checking both the relevant resources and the Operator status.
Understanding Operator Boundaries
When troubleshooting or planning changes, it helps to know which Operator “owns” which part of the cluster. A few mental mappings:
- API, controllers, scheduler → control plane Operators.
- Login, OAuth, IdPs → Authentication Operator.
- External traffic, routes → Ingress Operator.
- Pod networking, CNI → Network Operator.
- Nodes, OS config → Machine API Operator and Machine Config Operator.
- Web console → Console Operator.
- Internal registry → Image Registry Operator.
- Monitoring & logging → Monitoring and Logging Operators.
- Specialized services (storage, service mesh, serverless, pipelines) → Their dedicated Operators.
Summary
Common platform Operators in OpenShift collectively:
- Implement the core behavior and services of the platform.
- Encapsulate complex operational tasks (install, upgrade, scale, reconfigure) behind Kubernetes‑native APIs.
- Allow cluster administrators to manage infrastructure and platform services declaratively.
- Provide application teams with stable, higher‑level abstractions for networking, storage, CI/CD, observability, and more.
As you work with OpenShift, recognizing which Operator is responsible for a given feature or subsystem helps you:
- Locate the right configuration resources.
- Interpret cluster health and error messages.
- Apply changes safely by letting the appropriate Operator reconcile them.