Overview of the OpenShift Monitoring Stack
OpenShift ships with an opinionated, fully integrated monitoring stack built on top of popular open-source projects, primarily Prometheus and Alertmanager. This stack is:
- Installed and managed by Operators
- Split logically into cluster monitoring (platform) and user workload monitoring (applications)
- Configured declaratively via custom resources
- Designed to be upgradable and supported as part of the OpenShift platform
This chapter focuses on what is in that stack, how it is structured, and how you interact with it as a cluster admin and as an application developer.
Architecture and Components
The built-in monitoring stack is composed of multiple components running in specific namespaces, each with a distinct responsibility.
Core Components
Prometheus
Prometheus is the core time-series database and metrics scraper:
- Scrapes metrics from endpoints exposed over HTTP
- Stores metrics in a local time-series database
- Provides a query language (PromQL) used by the web console and other tools
OpenShift runs multiple Prometheus instances for different scopes:
- Cluster monitoring Prometheus in openshift-monitoring: monitors the OpenShift platform (API server, etcd, SDN, kubelet, nodes, etc.)
- User workload Prometheus in openshift-user-workload-monitoring (optional; must be enabled explicitly): monitors metrics from user namespaces/applications and is tenant-aware, separating metrics per namespace and enforcing RBAC
Alertmanager
Alertmanager:
- Receives alert notifications from Prometheus
- Handles:
- Grouping of alerts
- Inhibition (suppression) rules
- Notification routing (email, webhook, etc.)
OpenShift provides a managed Alertmanager instance:
- Configured via a Secret (for routes and receivers)
- Integrated with the cluster’s authentication for UI access
- Used by both cluster and user workload monitoring (with isolation rules)
Thanos Querier (or Aggregation Layer)
To provide a unified query entry point:
- OpenShift uses Thanos Querier (or a similar aggregation component, depending on version) to:
- Aggregate Prometheus instances
- Provide a single endpoint for queries
- Enforce RBAC for who can see which metrics
- The OpenShift web console talks to this aggregated endpoint, not directly to each Prometheus.
Metrics Collectors and Exporters
Several components expose or collect metrics:
- kube-state-metrics
- Exposes metrics based on Kubernetes objects’ state (Deployments, Pods, etc.)
- node-exporter
- Runs on each node, exposing host-level metrics (CPU, memory, disk, network)
- cAdvisor / kubelet metrics
- Container-level resource usage metrics (CPU, memory, filesystem, etc.)
- OpenShift component exporters
- API server, controllers, SDN, registry, etc. all expose metrics endpoints.
Namespaces and Logical Separation
The monitoring stack is split by namespaces, which also reflect responsibilities:
openshift-monitoring - Cluster (platform) monitoring components
- Prometheus, Alertmanager, Thanos Querier, kube-state-metrics, node-exporter, etc.
openshift-user-workload-monitoring - Optional monitoring for user workloads
- Separate Prometheus and Alertmanager configured to scrape user namespaces
User namespaces
- Applications expose metrics endpoints
- You create ServiceMonitor/PodMonitor objects here when user workload monitoring is enabled.
This separation allows:
- Clear support boundaries (platform vs. user)
- Different resource constraints and retention periods
- Security and multi-tenancy for user metrics
Cluster vs User Workload Monitoring
The built-in monitoring stack is deliberately split into cluster monitoring and user workload monitoring. Their differences are important both operationally and for security.
Cluster Monitoring (Platform)
Cluster monitoring is:
- Enabled by default
- Fully managed by OpenShift
- Intended for monitoring:
- Control plane components
- Worker nodes and system services
- Cluster-level infrastructure
Key characteristics:
- Runs only in openshift-monitoring
- Configuration managed via the cluster-monitoring-config ConfigMap
- Metrics retention and storage are tuned for platform SLIs and SLOs
- Not meant for scraping arbitrary user applications
As an administrator, you typically:
- Adjust retention and resource usage
- Configure remote write (if supported for your version)
- Configure Alertmanager routes for platform alerts
- Use it to monitor cluster health and performance
User Workload Monitoring
User workload monitoring:
- Is separate from cluster monitoring
- Targets user namespaces and application-level metrics
- Is optional but recommended for integrating application metrics with the platform
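User workload monitoring is switched on through the cluster monitoring ConfigMap. A minimal sketch (the key is enableUserWorkload in recent OpenShift 4.x releases; check the documentation for your version):

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true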
When enabled:
- A separate Prometheus (and Alertmanager) stack is deployed in openshift-user-workload-monitoring
- You define ServiceMonitor or PodMonitor resources in user namespaces to specify what to scrape
- RBAC ensures users only see metrics and alerts relevant to namespaces they can access
This separation allows:
- Independent scaling and retention settings for application metrics
- Isolation between tenants in multi-tenant clusters
- Reduced risk of user metrics impacting core platform monitoring
Configuration of the Built-in Monitoring Stack
The monitoring stack is managed primarily through custom resources and configuration objects that Operators reconcile. You do not manually edit the deployments.
Cluster Monitoring Configuration
Cluster-level configuration is done via a ConfigMap called cluster-monitoring-config in openshift-monitoring (exact name may vary slightly by version).
Typical structure (simplified):
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 15d
      resources:
        requests:
          memory: 2Gi
        limits:
          memory: 4Gi
    alertmanagerMain:
      resources:
        requests:
          memory: 512Mi
    telemeterClient:
      enabled: true

You use this to:
- Set Prometheus retention (retention)
- Tune CPU/memory resource requests and limits
- Enable/disable specific components
- Configure storage settings (e.g., persistent volumes for metrics)
- Configure remote write to external systems (if supported by your version and policy)
The Cluster Monitoring Operator reconciles this config and adjusts deployments automatically.
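As an illustration of the storage settings mentioned above, persistent storage for the platform Prometheus can be requested through the same config.yaml. A minimal sketch, assuming a storage class named fast-ssd exists in your cluster (name and size are placeholders):

prometheusK8s:
  volumeClaimTemplate:
    spec:
      storageClassName: fast-ssd   # placeholder storage class name
      resources:
        requests:
          storage: 40Gi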
User Workload Monitoring Configuration
User workload monitoring is configured via a ConfigMap in openshift-user-workload-monitoring, often called user-workload-monitoring-config.
Simplified example:
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      retention: 7d
      resources:
        requests:
          memory: 2Gi
    alertmanager:
      enabled: true

You use this to:
- Enable/disable user workload monitoring
- Control resource usage and retention for user metrics
- Configure user-facing Alertmanager
Configuring Alertmanager for Notifications
Platform Alertmanager is configured from a Secret in openshift-monitoring, often named alertmanager-main (the Operator manages many details).
Example template of the Alertmanager configuration (YAML stored as string in the Secret):
global:
  resolve_timeout: 5m
route:
  receiver: 'default'
  routes:
  - match:
      severity: critical
    receiver: 'pager'
  - match_re:
      severity: "warning|info"
    receiver: 'email'
receivers:
- name: 'default'
- name: 'pager'
  webhook_configs:
  - url: 'https://pagerduty.example.com/…'
- name: 'email'
  email_configs:
  - to: 'ops@example.com'

Typical admin tasks:
- Configure email, webhook, or chat integrations for platform alerts
- Control how alerts are grouped and deduplicated
- Manage escalation paths based on severity or labels
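The routing configuration shown above is stored under the alertmanager.yaml key of the alertmanager-main Secret. A minimal sketch of the wrapper object (you update the Secret; the Operator rolls out the change):

apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-main
  namespace: openshift-monitoring
stringData:
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
    # ... remaining routes and receivers as shown above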
User workload Alertmanager (if enabled) is configured similarly, but as a separate instance to handle application alerts.
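The application alerts that this separate instance routes are defined as PrometheusRule objects in user namespaces (when user workload monitoring is enabled). A minimal sketch; the metric, threshold, and names are illustrative:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: myapp-alerts
  namespace: myapp-namespace
spec:
  groups:
  - name: myapp.rules
    rules:
    - alert: MyAppHighErrorRate
      # fires when the HTTP 5xx rate stays above 1 req/s for 10 minutes
      expr: sum(rate(http_requests_total{code=~"5.."}[5m])) > 1
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: High HTTP 5xx rate for myapp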
Metrics Collection for Applications
The built-in stack gives you a standard way to have Prometheus scrape your apps:
Exposing Metrics
Applications need to expose an HTTP endpoint with metrics in Prometheus format, e.g.:
- Endpoint: /metrics
- Port: an HTTP port accessible within the cluster
- Content: text-based metrics such as:
# HELP http_requests_total Total number of HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",code="200"} 1024

Most languages have Prometheus client libraries that do this automatically.
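The scrape target is usually reached through a Service with a named port, which the ServiceMonitor in the next subsection refers to by name. A minimal sketch (names and the port number are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: myapp-namespace
  labels:
    app: myapp
spec:
  selector:
    app: myapp
  ports:
  - name: metrics        # referenced by the ServiceMonitor below
    port: 8080
    targetPort: 8080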
Using ServiceMonitor and PodMonitor
Instead of manually editing Prometheus configuration, you create custom resources:
ServiceMonitor
- Targets Service objects
- Common for applications with stable Services
PodMonitor
- Targets Pods directly based on label selectors
- Useful when Services are not appropriate or for DaemonSets
Example ServiceMonitor (in a user namespace):
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp-monitor
  namespace: myapp-namespace
spec:
  selector:
    matchLabels:
      app: myapp
  endpoints:
  - port: metrics
    path: /metrics
    interval: 30s

When user workload monitoring is enabled:
- The user workload Prometheus automatically discovers ServiceMonitor/PodMonitor objects
- It starts scraping the described endpoints with the defined interval and parameters
- No manual Prometheus config changes are required
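A PodMonitor works the same way but selects Pods directly instead of going through a Service; a minimal sketch (names and the metrics port are illustrative):

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: myapp-pods
  namespace: myapp-namespace
spec:
  selector:
    matchLabels:
      app: myapp
  podMetricsEndpoints:
  - port: metrics       # named container port on the selected Pods
    path: /metrics
    interval: 30s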
Labeling and Multi-tenancy
Metrics scraped by user workload Prometheus are labeled to preserve context:
- Namespace labels, e.g. namespace="myapp-namespace"
- Workload labels, e.g. deployment="myapp", pod="myapp-xyz"
- Custom labels added by the application or ServiceMonitor
RBAC and the query layer ensure:
- Users see metrics only for namespaces they are allowed to access
- An application team cannot query another team’s metrics in a multi-tenant cluster
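OpenShift ships namespace-scoped monitoring roles (for example monitoring-edit, monitoring-rules-edit, and monitoring-rules-view) that admins bind to application teams. A minimal sketch granting a user the ability to manage scrape targets and alerting rules in one namespace (the user and namespace names are examples):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: myapp-monitoring-edit
  namespace: myapp-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: monitoring-edit
subjects:
- kind: User
  name: app-team-member        # example user
  apiGroup: rbac.authorization.k8s.io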
Accessing Metrics and the Monitoring UI
The built-in stack is tightly integrated with the OpenShift web console and APIs.
Web Console Views
In the OpenShift web console you can:
- View dashboards with cluster and workload metrics
- Use the Metrics or Observe section to:
- Run PromQL queries
- Visualize time-series data
- Inspect alerts:
- Filter by severity or source (platform vs. user workloads)
- See alert description, runbook URLs (if configured), and current status
What you see is filtered by RBAC:
- Cluster admins can see platform metrics and alerts
- Namespace-scoped users see application metrics and alerts for their namespaces
Query Endpoints
Under the hood, queries go through the Thanos Querier or similar component, which:
- Talks to all relevant Prometheus instances
- Merges results
- Enforces access control
You can also use the oc CLI or HTTP APIs to:
- Query metrics via the /api/v1/query and /api/v1/query_range endpoints (behind appropriate routes)
- Integrate with external dashboards such as Grafana (if allowed by platform policy)
Extending and Integrating the Built-in Stack
The built-in monitoring stack is designed as the default, supported solution; you can still integrate it with external systems.
Remote Write and External Storage
Depending on OpenShift version and configuration policy, the built-in Prometheus may support:
- Remote write to external systems (e.g., Thanos, Cortex, Mimir, VictoriaMetrics)
- Longer-term retention than what is configured in-cluster
- Central aggregation of metrics across multiple clusters
This is usually configured in the cluster monitoring configuration:
prometheusK8s:
  remoteWrite:
  - url: https://external-metrics.example.com/api/v1/write
    writeRelabelConfigs:
    - sourceLabels: [__name__]
      regex: "container_.*"
      action: keep

Check version-specific documentation and organizational policy before enabling remote write.
External Dashboards
While OpenShift provides built-in visualization, you can:
- Deploy Grafana or other tools in a separate namespace
- Point them at:
- The Thanos Querier endpoint (with proper authentication)
- Or external long-term storage that receives remote writes
This approach is common when you need:
- More complex dashboards
- Cross-cluster views
- Custom visualization not supported by the built-in console
Operational Considerations
The built-in monitoring stack is a first-class platform component; treating it like any other application may lead to issues. Key points:
Resource Usage and Sizing
Monitoring can be resource-intensive:
- Prometheus memory usage grows with:
- Number of time series
- Scrape interval
- Label cardinality (the number of distinct label combinations)
- Node-exporter and kube-state-metrics scale with number of nodes and objects
As an admin, you:
- Size Prometheus and related components via cluster-monitoring-config and user-workload-monitoring-config
- Avoid overly frequent scrape intervals when not needed
- Encourage teams to design low-cardinality metrics
Storage and Retention
Prometheus uses local storage on persistent volumes:
- Retention is controlled by configuration (e.g., retention: 15d)
- Larger retention → more disk space; also more data to query
Typical practices:
- Shorter retention in-cluster (days to weeks)
- Offload long-term storage to an external system via remote write
- Monitor disk usage and I/O saturation for Prometheus pods
Availability and Upgrades
Because monitoring is part of the platform:
- OpenShift Operators manage upgrades of monitoring components
- You should avoid manual modification of Deployments or StatefulSets in openshift-monitoring and openshift-user-workload-monitoring
- Changes should be done only via supported configuration mechanisms (ConfigMaps, Secrets, custom resources)
The Operators orchestrate:
- Rolling upgrades of Prometheus/Alertmanager
- Preservation of data across restarts (if persistent storage is configured)
- Consistent configuration across components
Typical Usage Patterns
To tie everything together, here is how the built-in stack is typically used:
- Cluster admin
- Monitors platform health via console dashboards
- Configures Alertmanager for infrastructure alerts
- Tunes retention and resource usage
- Optionally configures remote write to a central monitoring system
- Application team
- Exposes /metrics endpoints in workloads
- Creates ServiceMonitor or PodMonitor in their namespace
- Queries application metrics via the console
- Uses alerts (via user workload Alertmanager) for application-specific conditions
The built-in monitoring stack provides a standardized, supported foundation for metrics and alerts throughout an OpenShift cluster, without requiring you to build and maintain your own Prometheus deployment from scratch.