Overview of the OpenShift Monitoring Stack
OpenShift ships with an opinionated, fully integrated monitoring stack built on top of popular open-source projects, primarily Prometheus and Alertmanager. This stack is:
- Installed and managed by Operators
- Split logically into cluster monitoring (platform) and user workload monitoring (applications)
- Configured declaratively via custom resources
- Designed to be upgradable and supported as part of the OpenShift platform
This chapter focuses on what is in that stack, how it is structured, and how you interact with it as a cluster admin and as an application developer.
Architecture and Components
The built-in monitoring stack is composed of multiple components running in specific namespaces, each with a distinct responsibility.
Core Components
Prometheus
Prometheus is the core time-series database and metrics scraper:
- Scrapes metrics from endpoints exposed over HTTP
- Stores metrics in a local time-series database
- Provides a query language (PromQL) used by the web console and other tools
OpenShift runs multiple Prometheus instances for different scopes:
- Cluster monitoring Prometheus in openshift-monitoring: monitors the OpenShift platform (API server, etcd, SDN, kubelet, nodes, etc.)
- User workload Prometheus in openshift-user-workload-monitoring (optional; must be enabled explicitly): monitors metrics from user namespaces/applications and is tenant-aware, separating metrics per namespace and enforcing RBAC
Alertmanager
Alertmanager:
- Receives alert notifications from Prometheus
- Handles:
- Grouping of alerts
- Inhibition (suppression) rules
- Notification routing (email, webhook, etc.)
OpenShift provides a managed Alertmanager instance:
- Configured via a Secret (for routes and receivers)
- Integrated with the cluster’s authentication for UI access
- Used by both cluster and user workload monitoring (with isolation rules)
Thanos Querier (or Aggregation Layer)
To provide a unified query entry point:
- OpenShift uses Thanos Querier (or a similar aggregation component, depending on version) to:
- Aggregate Prometheus instances
- Provide a single endpoint for queries
- Enforce RBAC for who can see which metrics
- The OpenShift web console talks to this aggregated endpoint, not directly to each Prometheus.
Metrics Collectors and Exporters
Several components expose or collect metrics:
- kube-state-metrics
- Exposes metrics based on Kubernetes objects’ state (Deployments, Pods, etc.)
- node-exporter
- Runs on each node, exposing host-level metrics (CPU, memory, disk, network)
- cAdvisor / kubelet metrics
- Container-level resource usage metrics (CPU, memory, filesystem, etc.)
- OpenShift component exporters
- API server, controllers, SDN, registry, etc. all expose metrics endpoints.
Namespaces and Logical Separation
The monitoring stack is split by namespaces, which also reflect responsibilities:
openshift-monitoring - Cluster (platform) monitoring components
- Prometheus, Alertmanager, Thanos Querier, kube-state-metrics, node-exporter, etc.
openshift-user-workload-monitoring - Optional monitoring for user workloads
- Separate Prometheus and Alertmanager configured to scrape user namespaces
User namespaces
- Applications expose metrics endpoints
- You create ServiceMonitor/PodMonitor objects here when user workload monitoring is enabled.
This separation allows:
- Clear support boundaries (platform vs. user)
- Different resource constraints and retention periods
- Security and multi-tenancy for user metrics
Cluster vs User Workload Monitoring
The built-in monitoring stack is deliberately split into cluster monitoring and user workload monitoring. Their differences are important both operationally and for security.
Cluster Monitoring (Platform)
Cluster monitoring is:
- Enabled by default
- Fully managed by OpenShift
- Intended for monitoring:
- Control plane components
- Worker nodes and system services
- Cluster-level infrastructure
Key characteristics:
- Runs only in openshift-monitoring
- Configuration managed via the cluster-monitoring-config ConfigMap
- Metrics retention and storage are tuned for platform SLIs and SLOs
- Not meant for scraping arbitrary user applications
As an administrator, you typically:
- Adjust retention and resource usage
- Configure remote write (if supported for your version)
- Configure Alertmanager routes for platform alerts
- Use it to monitor cluster health and performance
User Workload Monitoring
User workload monitoring:
- Is separate from cluster monitoring
- Targets user namespaces and application-level metrics
- Is optional but recommended for integrating application metrics with the platform
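User workload monitoring is switched on through the cluster monitoring ConfigMap. A minimal sketch (the key is enableUserWorkload in recent OpenShift 4.x releases; check the documentation for your version):

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true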
When enabled:
- A separate Prometheus (and Alertmanager) stack is deployed in openshift-user-workload-monitoring
- You define ServiceMonitor or PodMonitor resources in user namespaces to specify what to scrape
- RBAC ensures users only see metrics and alerts relevant to namespaces they can access
This separation allows:
- Independent scaling and retention settings for application metrics
- Isolation between tenants in multi-tenant clusters
- Reduced risk of user metrics impacting core platform monitoring
Configuration of the Built-in Monitoring Stack
The monitoring stack is managed primarily through custom resources and configuration objects that Operators reconcile. You do not manually edit the deployments.
Cluster Monitoring Configuration
Cluster-level configuration is done via a ConfigMap called cluster-monitoring-config in openshift-monitoring (exact name may vary slightly by version).
Typical structure (simplified):
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 15d
      resources:
        requests:
          memory: 2Gi
        limits:
          memory: 4Gi
    alertmanagerMain:
      resources:
        requests:
          memory: 512Mi
    telemeterClient:
      enabled: true

You use this to:
- Set Prometheus retention (retention)
- Tune CPU/memory resource requests and limits
- Enable/disable specific components
- Configure storage settings (e.g., persistent volumes for metrics)
- Configure remote write to external systems (if supported by your version and policy)
The Cluster Monitoring Operator reconciles this config and adjusts deployments automatically.
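As an illustration of the storage settings mentioned above, persistent storage for the platform Prometheus can be requested through the same config.yaml. A minimal sketch, assuming a storage class named fast-ssd exists in your cluster (name and size are placeholders):

prometheusK8s:
  volumeClaimTemplate:
    spec:
      storageClassName: fast-ssd   # placeholder storage class name
      resources:
        requests:
          storage: 40Gi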
User Workload Monitoring Configuration
User workload monitoring is configured via a ConfigMap in openshift-user-workload-monitoring, often called user-workload-monitoring-config.
Simplified example:
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      retention: 7d
      resources:
        requests:
          memory: 2Gi
    alertmanager:
      enabled: true

You use this to:
- Enable/disable user workload monitoring
- Control resource usage and retention for user metrics
- Configure user-facing Alertmanager
Configuring Alertmanager for Notifications
Platform Alertmanager is configured from a Secret in openshift-monitoring, often named alertmanager-main (the Operator manages many details).
Example template of the Alertmanager configuration (YAML stored as string in the Secret):
global:
  resolve_timeout: 5m
route:
  receiver: 'default'
  routes:
  - match:
      severity: critical
    receiver: 'pager'
  - match_re:
      severity: "warning|info"
    receiver: 'email'
receivers:
- name: 'default'
- name: 'pager'
  webhook_configs:
  - url: 'https://pagerduty.example.com/…'
- name: 'email'
  email_configs:
  - to: 'ops@example.com'

Typical admin tasks:
- Configure email, webhook, or chat integrations for platform alerts
- Control how alerts are grouped and deduplicated
- Manage escalation paths based on severity or labels
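The routing configuration shown above is stored under the alertmanager.yaml key of the alertmanager-main Secret. A minimal sketch of the wrapper object (you update the Secret; the Operator rolls out the change):

apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-main
  namespace: openshift-monitoring
stringData:
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
    # ... remaining routes and receivers as shown above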
User workload Alertmanager (if enabled) is configured similarly, but as a separate instance to handle application alerts.
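The application alerts that this separate instance routes are defined as PrometheusRule objects in user namespaces (when user workload monitoring is enabled). A minimal sketch; the metric, threshold, and names are illustrative:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: myapp-alerts
  namespace: myapp-namespace
spec:
  groups:
  - name: myapp.rules
    rules:
    - alert: MyAppHighErrorRate
      # fires when the HTTP 5xx rate stays above 1 req/s for 10 minutes
      expr: sum(rate(http_requests_total{code=~"5.."}[5m])) > 1
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: High HTTP 5xx rate for myapp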
Metrics Collection for Applications
The built-in stack gives you a standard way to have Prometheus scrape your apps:
Exposing Metrics
Applications need to expose an HTTP endpoint with metrics in Prometheus format, e.g.:
- Endpoint: /metrics
- Port: an HTTP port accessible within the cluster
- Content: text-based metrics such as:
# HELP http_requests_total Total number of HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",code="200"} 1024

Most languages have Prometheus client libraries that do this automatically.
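The scrape target is usually reached through a Service with a named port, which the ServiceMonitor in the next subsection refers to by name. A minimal sketch (names and the port number are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: myapp-namespace
  labels:
    app: myapp
spec:
  selector:
    app: myapp
  ports:
  - name: metrics        # referenced by the ServiceMonitor below
    port: 8080
    targetPort: 8080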
Using ServiceMonitor and PodMonitor
Instead of manually editing Prometheus configuration, you create custom resources:
ServiceMonitor
- Targets Service objects
- Common for applications with stable Services
PodMonitor
- Targets Pods directly based on label selectors
- Useful when Services are not appropriate or for DaemonSets
Example ServiceMonitor (in a user namespace):
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp-monitor
  namespace: myapp-namespace
spec:
  selector:
    matchLabels:
      app: myapp
  endpoints:
  - port: metrics
    path: /metrics
    interval: 30s

When user workload monitoring is enabled:
- The user workload Prometheus automatically discovers ServiceMonitor/PodMonitor objects
- It starts scraping the described endpoints with the defined interval and parameters
- No manual Prometheus config changes are required
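A PodMonitor works the same way but selects Pods directly instead of going through a Service; a minimal sketch (names and the metrics port are illustrative):

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: myapp-pods
  namespace: myapp-namespace
spec:
  selector:
    matchLabels:
      app: myapp
  podMetricsEndpoints:
  - port: metrics       # named container port on the selected Pods
    path: /metrics
    interval: 30s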
Labeling and Multi-tenancy
Metrics scraped by user workload Prometheus are labeled to preserve context:
- Namespace labels, e.g. namespace="myapp-namespace"
- Workload labels, e.g. deployment="myapp", pod="myapp-xyz"
- Custom labels added by the application or ServiceMonitor
RBAC and the query layer ensure:
- Users see metrics only for namespaces they are allowed to access
- An application team cannot query another team’s metrics in a multi-tenant cluster
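OpenShift ships namespace-scoped monitoring roles (for example monitoring-edit, monitoring-rules-edit, and monitoring-rules-view) that admins bind to application teams. A minimal sketch granting a user the ability to manage scrape targets and alerting rules in one namespace (the user and namespace names are examples):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: myapp-monitoring-edit
  namespace: myapp-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: monitoring-edit
subjects:
- kind: User
  name: app-team-member        # example user
  apiGroup: rbac.authorization.k8s.io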
Accessing Metrics and the Monitoring UI
The built-in stack is tightly integrated with the OpenShift web console and APIs.
Web Console Views
In the OpenShift web console you can:
- View dashboards with cluster and workload metrics
- Use the Metrics or Observe section to:
- Run PromQL queries
- Visualize time-series data
- Inspect alerts:
- Filter by severity or source (platform vs. user workloads)
- See alert description, runbook URLs (if configured), and current status
What you see is filtered by RBAC:
- Cluster admins can see platform metrics and alerts
- Namespace-scoped users see application metrics and alerts for their namespaces
Query Endpoints
Under the hood, queries go through the Thanos Querier or similar component, which:
- Talks to all relevant Prometheus instances
- Merges results
- Enforces access control
You can also use the oc CLI or HTTP APIs to:
- Query metrics via the /api/v1/query and /api/v1/query_range endpoints (behind appropriate routes)
- Integrate with external dashboards such as Grafana (if allowed by platform policy)
Extending and Integrating the Built-in Stack
The built-in monitoring stack is designed as the default, supported solution; you can still integrate it with external systems.
Remote Write and External Storage
Depending on OpenShift version and configuration policy, the built-in Prometheus may support:
- Remote write to external systems (e.g., Thanos, Cortex, Mimir, VictoriaMetrics)
- Longer-term retention than what is configured in-cluster
- Central aggregation of metrics across multiple clusters
This is usually configured in the cluster monitoring configuration:
prometheusK8s:
  remoteWrite:
  - url: https://external-metrics.example.com/api/v1/write
    writeRelabelConfigs:
    - sourceLabels: [__name__]
      regex: "container_.*"
      action: keep

Check version-specific documentation and organizational policy before enabling remote write.
External Dashboards
While OpenShift provides built-in visualization, you can:
- Deploy Grafana or other tools in a separate namespace
- Point them at:
- The Thanos Querier endpoint (with proper authentication)
- Or external long-term storage that receives remote writes
This approach is common when you need:
- More complex dashboards
- Cross-cluster views
- Custom visualization not supported by the built-in console
Operational Considerations
The built-in monitoring stack is a first-class platform component; treating it like any other application may lead to issues. Key points:
Resource Usage and Sizing
Monitoring can be resource-intensive:
- Prometheus memory usage grows with:
- Number of time series
- Scrape interval
- Label cardinality (the number of distinct label combinations)
- Node-exporter and kube-state-metrics scale with number of nodes and objects
As an admin, you:
- Size Prometheus and related components via cluster-monitoring-config and user-workload-monitoring-config
- Avoid overly frequent scrape intervals when not needed
- Encourage teams to design low-cardinality metrics
Storage and Retention
Prometheus uses local storage on persistent volumes:
- Retention is controlled by configuration (e.g., retention: 15d)
- Larger retention → more disk space; also more data to query
Typical practices:
- Shorter retention in-cluster (days to weeks)
- Offload long-term storage to an external system via remote write
- Monitor disk usage and I/O saturation for Prometheus pods
Availability and Upgrades
Because monitoring is part of the platform:
- OpenShift Operators manage upgrades of monitoring components
- You should avoid manual modification of Deployments or StatefulSets in openshift-monitoring and openshift-user-workload-monitoring
- Changes should be done only via supported configuration mechanisms (ConfigMaps, Secrets, custom resources)
The Operators orchestrate:
- Rolling upgrades of Prometheus/Alertmanager
- Preservation of data across restarts (if persistent storage is configured)
- Consistent configuration across components
Typical Usage Patterns
To tie everything together, here is how the built-in stack is typically used:
- Cluster admin
- Monitors platform health via console dashboards
- Configures Alertmanager for infrastructure alerts
- Tunes retention and resource usage
- Optionally configures remote write to a central monitoring system
- Application team
- Exposes /metrics endpoints in workloads
- Creates ServiceMonitor or PodMonitor in their namespace
- Queries application metrics via the console
- Uses alerts (via user workload Alertmanager) for application-specific conditions
The built-in monitoring stack provides a standardized, supported foundation for metrics and alerts throughout an OpenShift cluster, without requiring you to build and maintain your own Prometheus deployment from scratch.