Key Goals of the Logging Architecture in OpenShift
OpenShift’s logging architecture is designed to:
- Collect logs from all relevant components (application containers, infrastructure, and platform services).
- Normalize and enrich logs for search and analysis.
- Route logs to one or more backends (e.g., Loki, Elasticsearch, external SIEM).
- Enforce multi-tenancy and security boundaries.
- Provide operators and developers with flexible ways to query and retain logs.
This chapter focuses on how these goals are achieved structurally, not on how to use specific tools step-by-step.
Main Components of the OpenShift Logging Stack
While implementations evolve between OpenShift versions, the logical architecture typically includes:
- Log collectors/forwarders: Agents on each node that read logs and send them to a central store or external systems.
- Log storage and indexing: Systems that receive, store, and index logs for querying (e.g., Loki, Elasticsearch).
- Log routing and pipelines: Rules and configuration that define what logs go where and how they are transformed.
- Access and visualization layer: User interfaces and APIs to query, filter, and export logs (e.g., web console integration, CLI, Kibana/Grafana).
Node-Level Log Collection
On every node, logs are generated from:
- Containers and pods (stdout/stderr of containers, container runtime logs).
- Node system services and kubelet.
- OpenShift platform components running as pods on that node.
A log collector runs as a DaemonSet so that:
- Each node has a collector pod.
- The collector watches log files under locations like /var/log/containers and /var/log/pods (paths can differ slightly by runtime and version).
- Logs are tagged with metadata such as:
- Namespace
- Pod name
- Container name
- Node name
- Labels/annotations (selected subsets)
This metadata is critical for multi-tenant access control and for routing.
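As an illustration, the sketch below derives that metadata from the container log file name alone, following the `<pod>_<namespace>_<container>-<container-id>.log` naming convention used by CRI-O and containerd under /var/log/containers. The function name and example path are hypothetical; real collectors such as Fluentd or Vector enrich this further by querying the Kubernetes API for labels and annotations.

```python
import re
from pathlib import Path

# Files under /var/log/containers are typically symlinks named
# <pod>_<namespace>_<container>-<container-id>.log (CRI-O/containerd convention).
LOG_NAME = re.compile(
    r"^(?P<pod>[^_]+)_(?P<namespace>[^_]+)_(?P<container>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)

def extract_metadata(log_path: str) -> dict:
    """Derive pod/namespace/container metadata from a container log file name."""
    match = LOG_NAME.match(Path(log_path).name)
    if not match:
        raise ValueError(f"unrecognized container log name: {log_path}")
    return match.groupdict()

print(extract_metadata(
    "/var/log/containers/frontend-5d9c_shop_web-" + "a" * 64 + ".log"
))
# -> {'pod': 'frontend-5d9c', 'namespace': 'shop', 'container': 'web', 'container_id': '<64 hex chars>'}
```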
Log Types and Logical Separation
OpenShift conceptually separates logs into three categories:
- Application logs
- Anything emitted by user workloads (pods in user namespaces).
- Infrastructure logs
- Logs from cluster infrastructure components, including nodes and selected infrastructure pods.
- Audit logs
- Kubernetes / OpenShift API audit events, recording who did what, where, and when.
This separation allows:
- Different retention periods per category.
- Different routing targets (e.g., audit logs to a secure archival system).
- Different access control rules.
In the logging architecture, these categories are implemented as different pipelines or indices/tenants in the backend.
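As a rough illustration of this separation, the sketch below classifies a single log record using common OpenShift conventions: audit events come from dedicated audit log files on the node, and system namespaces such as openshift-* and kube-* hold infrastructure logs. The exact rules a real collector applies are version-specific, so treat the function and field names as assumptions.

```python
def classify(record: dict) -> str:
    """Assign a log record to one of the three logical categories.

    The rules below mirror the common convention that system namespaces
    carry infrastructure logs and that audit events come from dedicated
    audit log files; real collectors apply more detailed, versioned rules.
    """
    source_file = record.get("file", "")
    namespace = record.get("namespace", "")

    if "/audit/" in source_file or source_file.endswith("audit.log"):
        return "audit"
    if not namespace or namespace.startswith(("openshift-", "kube-")) or namespace in ("default", "openshift"):
        return "infrastructure"
    return "application"

print(classify({"namespace": "shop", "file": "/var/log/containers/web.log"}))    # application
print(classify({"namespace": "openshift-etcd", "file": "/var/log/pods/etcd.log"}))  # infrastructure
print(classify({"namespace": "", "file": "/var/log/kube-apiserver/audit.log"}))  # audit
```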
OpenShift Logging Stacks: Classic vs Modern
The concrete components differ by OpenShift release, but the architectural roles remain similar.
Classic EFK-Based Architecture (Elasticsearch, Fluentd/Fluent Bit, Kibana)
Older or “classic” OpenShift logging often uses an EFK pattern:
- Collector: Fluentd or Fluent Bit DaemonSet on each node.
- Storage/Index: Elasticsearch cluster deployed inside the OpenShift cluster.
- Visualization: Kibana connected to Elasticsearch.
Architectural characteristics:
- Fluentd/Fluent Bit:
- Tails log files, adds metadata, and pushes logs to Elasticsearch.
- Can route different log types to different indices.
- Elasticsearch:
- Stores logs in indices such as app-*, infra-*, and audit-*.
- Supports full-text search and structured queries.
- Kibana:
- Visual interface for searches, dashboards, and saved queries.
This architecture is resource-intensive (especially Elasticsearch) and often used together with Operator-based management.
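A hedged example of how such a backend is queried: the snippet below searches the app-* indices for recent logs from one namespace using Elasticsearch's standard _search API. The service URL, token, and CA path are placeholders, and the kubernetes.namespace_name field reflects the metadata attached by the collector; an on-cluster deployment would normally only expose this endpoint inside the cluster.

```python
import requests

# Placeholder endpoint and token; a real cluster requires in-cluster access
# and a valid bearer token or client certificate.
ES_URL = "https://elasticsearch.openshift-logging.svc:9200"
TOKEN = "<service-account-token>"

query = {
    "size": 20,
    "sort": [{"@timestamp": {"order": "desc"}}],
    "query": {
        "bool": {
            "filter": [
                {"term": {"kubernetes.namespace_name": "shop"}},
                {"range": {"@timestamp": {"gte": "now-1h"}}},
            ]
        }
    },
}

# Search only the application indices (app-*), not infra-* or audit-*.
resp = requests.post(
    f"{ES_URL}/app-*/_search",
    json=query,
    headers={"Authorization": f"Bearer {TOKEN}"},
    verify="/path/to/ca.crt",  # cluster CA bundle (placeholder path)
    timeout=30,
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"].get("message"))
```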
Loki-Based Architecture and Logging with Operators
More recent OpenShift logging stacks commonly use:
- Cluster Logging Operator
- LokiStack (deployed and managed by the Loki Operator)
- Log Forwarding Custom Resources
Typical structure:
- Collector:
- Vector (or, in earlier releases, Fluentd) as the per-node collector.
- Central Storage:
- Loki (a label-indexed, object-storage-friendly log aggregation system).
- Query Layer:
- Loki’s query frontend, often integrated with OpenShift Console or Grafana.
Architectural implications:
- Logs are stored as compressed streams grouped by labels.
- Multi-tenancy is often enforced via per-tenant or per-namespace label sets and authentication tokens.
- Object storage backends (e.g., S3-compatible) are used, allowing separation of compute and storage.
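The sketch below shows what a query against this stack can look like: a request to Loki's query_range API through a per-tenant gateway path, authenticated with a bearer token. The gateway host, tenant path layout, token, and label names are assumptions about a typical LokiStack deployment rather than guaranteed values.

```python
import time
import requests

# Placeholder route host and token; the LokiStack gateway typically exposes a
# per-tenant path (application, infrastructure, audit) in front of Loki's API.
GATEWAY = "https://lokistack-gateway-openshift-logging.apps.example.com"
TENANT = "application"
TOKEN = "<user-or-service-account-token>"

params = {
    # LogQL selector; label names such as kubernetes_namespace_name depend on
    # how the collector maps metadata to Loki labels.
    "query": '{kubernetes_namespace_name="shop"} |= "error"',
    "start": int((time.time() - 3600) * 1e9),  # nanoseconds, last hour
    "end": int(time.time() * 1e9),
    "limit": 100,
}

resp = requests.get(
    f"{GATEWAY}/api/logs/v1/{TENANT}/loki/api/v1/query_range",
    params=params,
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
for stream in resp.json()["data"]["result"]:
    for ts, line in stream["values"]:
        print(line)
```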
Log Routing and Pipelines
The routing logic is central to the logging architecture. It defines:
- Which logs are collected.
- How they are transformed (filtering, parsing, redaction).
- Where they are sent (internal Loki/Elasticsearch, external systems).
ClusterLogForwarder and Pipelines (Conceptual Model)
In recent OpenShift releases, log forwarding is configured via CRDs like ClusterLogForwarder:
- Inputs:
- Specify log sources: application, infrastructure, and audit.
- Can filter by namespace, labels, or other criteria.
- Filters/Transformations:
- Parse structured logs (JSON).
- Drop or mask sensitive fields.
- Rewrite fields or add additional metadata.
- Outputs:
- Internal logging stack (e.g., Loki).
- External endpoints:
- Syslog.
- HTTP/S endpoints.
- Elasticsearch clusters outside OpenShift.
- Cloud logging services or SIEM tools.
Pipelines connect inputs to outputs, optionally with filters in between:
- Each pipeline is a flow: input → filter(s) → output.
- Different pipelines can implement:
- Separate paths for audit logs (more secure, immutable).
- Dedicated routing for logs of a specific application group.
- Multi-destination routing (e.g., both internal Loki and external SIEM).
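To make this input → filter → output model concrete, the sketch below builds a ClusterLogForwarder-style manifest as a Python dictionary and prints it as YAML. The field names follow the logging.openshift.io/v1 API as commonly documented; newer releases use a revised API group (observability.openshift.io) with slightly different fields, and the output names, URLs, and namespaces here are hypothetical.

```python
import yaml  # requires PyYAML

# Sketch of a ClusterLogForwarder in the logging.openshift.io/v1 style.
cluster_log_forwarder = {
    "apiVersion": "logging.openshift.io/v1",
    "kind": "ClusterLogForwarder",
    "metadata": {"name": "instance", "namespace": "openshift-logging"},
    "spec": {
        # Named input limited to selected namespaces.
        "inputs": [
            {"name": "shop-apps", "application": {"namespaces": ["shop", "shop-batch"]}}
        ],
        # Hypothetical external receivers.
        "outputs": [
            {"name": "corp-siem", "type": "syslog", "url": "tls://siem.example.com:6514"},
            {"name": "audit-archive", "type": "elasticsearch", "url": "https://es-archive.example.com:9200"},
        ],
        "pipelines": [
            # Application logs from the selected namespaces go to the internal store.
            {"name": "shop-to-internal", "inputRefs": ["shop-apps"], "outputRefs": ["default"]},
            # Audit logs go both to the internal store and to an external archive.
            {"name": "audit-split", "inputRefs": ["audit"], "outputRefs": ["default", "audit-archive"]},
            # Infrastructure logs are copied to the corporate SIEM.
            {"name": "infra-to-siem", "inputRefs": ["infrastructure"], "outputRefs": ["corp-siem"]},
        ],
    },
}

print(yaml.safe_dump(cluster_log_forwarder, sort_keys=False))
```

In this API style, the reserved output name "default" stands for the internal log store; the other outputs are the kind of external destinations listed above.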
Multi-Tenancy and Isolation in Log Routing
The architecture must align with OpenShift’s multi-tenant model:
- Namespace-level isolation:
- Users can see logs only from namespaces they are authorized to access.
- Routing rules:
- Prevent cross-tenant leakage by carefully controlling which labels and fields are used for indexing and access.
- Dedicated outputs (optional):
- Some tenants or applications can have their own external destinations.
- Example: a regulated team sends a copy of their logs to a compliant external archive.
This is implemented via:
- Label-based filters in the collector.
- Per-tenant identities enforced at the query layer (e.g., via OAuth, tokens, or RBAC in the console).
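As a simplified illustration of query-layer enforcement, the sketch below refuses to build a LogQL selector for a namespace the caller is not authorized to read. The function, label name, and namespace sets are hypothetical; in a real cluster this decision is made by the gateway or backend using the caller's token and an authorization check, not by application code.

```python
def restrict_selector(allowed_namespaces: set[str], requested_namespace: str) -> str:
    """Build a LogQL selector only if the requester may read the namespace.

    Illustrative only: in practice the LokiStack gateway (or the backend's
    own tenancy layer) performs this check against the caller's token.
    """
    if requested_namespace not in allowed_namespaces:
        raise PermissionError(f"not authorized to read logs in {requested_namespace!r}")
    return f'{{kubernetes_namespace_name="{requested_namespace}"}}'

allowed = {"shop", "shop-batch"}
print(restrict_selector(allowed, "shop"))   # {kubernetes_namespace_name="shop"}
# restrict_selector(allowed, "payments")    # would raise PermissionError
```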
Storage, Retention, and Scalability
Storage Layout
In the backend (Elasticsearch or Loki), the architecture typically organizes logs by:
- Time (indices or chunks partitioned by date).
- Category (application/infra/audit).
- Tenant or namespace labels.
This layout supports:
- Time-based retention policies.
- Efficient queries limited to relevant indices/chunks.
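A toy sketch of the effect of this partitioning, assuming a hypothetical category-plus-date naming scheme: a bounded query only ever has to read the partitions matching its category and time window, never the audit data or other days.

```python
from datetime import date, timedelta

def partitions_for_query(category: str, start: date, end: date) -> list[str]:
    """Return the (hypothetical) daily partitions a query must read.

    Backends name real partitions differently (Elasticsearch indices,
    Loki chunks keyed by labels and time), but the effect is the same.
    """
    names, day = [], start
    while day <= end:
        names.append(f"{category}-{day:%Y.%m.%d}")
        day += timedelta(days=1)
    return names

print(partitions_for_query("app", date(2024, 3, 1), date(2024, 3, 3)))
# ['app-2024.03.01', 'app-2024.03.02', 'app-2024.03.03']
```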
Retention and Lifecycle Management
The logging architecture supports different retention policies, such as:
- Short retention (e.g., days) for high-volume application logs.
- Longer retention for audit logs due to compliance needs.
- Tiering:
- Hot storage for recent logs (fast search).
- Warm/cold or archived storage for older logs (slower, cheaper).
Actual implementation can involve:
- Index lifecycle policies (in Elasticsearch).
- Object storage lifecycle rules (for Loki or archived indices).
- Automated deletion or movement of old data.
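As one hedged example, the snippet below defines a short-retention policy for application indices using Elasticsearch's standard _ilm API. The endpoint, token, and retention periods are placeholders, and an operator-managed OpenShift deployment would normally express retention through its own custom resources rather than a hand-written ILM policy.

```python
import requests

ES_URL = "https://elasticsearch.openshift-logging.svc:9200"  # placeholder
TOKEN = "<admin-token>"                                       # placeholder

# Delete application indices seven days after rollover; audit indices would
# use a much longer min_age in a separate policy.
policy = {
    "policy": {
        "phases": {
            "hot": {"actions": {"rollover": {"max_age": "1d", "max_size": "50gb"}}},
            "delete": {"min_age": "7d", "actions": {"delete": {}}},
        }
    }
}

resp = requests.put(
    f"{ES_URL}/_ilm/policy/app-logs-short-retention",
    json=policy,
    headers={"Authorization": f"Bearer {TOKEN}"},
    verify="/path/to/ca.crt",  # placeholder CA bundle
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```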
Scaling Considerations
The architecture scales horizontally by:
- Adding more collector pods (this happens automatically with DaemonSets as the node count grows).
- Scaling the log storage cluster:
- More Elasticsearch data nodes or more Loki ingesters/queriers.
- Using sharding strategies:
- Partitioning indices by time and category.
- Distributing queries across multiple backend nodes.
Key architectural trade-offs:
- Higher ingestion rates vs. resource consumption.
- Indexing depth and field cardinality vs. query performance.
- On-cluster storage vs. external / managed logging services.
Integration Points and Access Paths
Integration with the OpenShift Web Console
The logging architecture integrates with the console to provide:
- Pod log viewing:
- Stream or tail logs directly from running pods.
- When backed by a central store, also access older logs beyond the node’s local retention.
- Central log views:
- Cluster-wide or namespace-scoped log search (depending on configuration).
- Links from workloads or events to relevant logs.
From an architectural standpoint:
- The console communicates with the backend APIs (e.g., Loki, Elasticsearch) through authenticated, RBAC-aware endpoints.
- The user’s cluster permissions determine which logs can be queried.
External Logging and SIEM Systems
Many environments integrate OpenShift logs into existing enterprise logging platforms. Architecturally, this is enabled by:
- Log forwarders that send logs to:
- External Elasticsearch clusters.
- Syslog collectors (RFC 3164 or RFC 5424).
- HTTP endpoints (for cloud logging or custom collectors).
- Format transformations:
- Normalize fields to match existing logging schemas.
- Wrap or encode messages as required by the target system.
The internal logging stack may:
- Act purely as a local cache/buffer before forwarding.
- Or be bypassed if all logs are sent directly to external systems.
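To show what the receiving side of HTTP forwarding can look like, the sketch below is a toy endpoint that accepts batched, newline-delimited JSON records. The payload shape and field names depend on the forwarder's output configuration and data model, so they are assumptions here; a production receiver would sit behind TLS and authentication.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class LogReceiver(BaseHTTPRequestHandler):
    """Toy endpoint standing in for an external collector or SIEM ingest API."""

    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # Forwarders typically batch records; assume newline-delimited JSON here.
        for line in body.splitlines():
            if not line.strip():
                continue
            record = json.loads(line)
            # Field names assume a kubernetes.* metadata block on each record.
            print(record.get("kubernetes", {}).get("namespace_name"), record.get("message"))
        self.send_response(204)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8443), LogReceiver).serve_forever()
```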
Security and Compliance Aspects of the Architecture
While detailed security topics belong to security chapters, the logging architecture structurally supports:
- Secure transport:
- TLS between collectors and backends.
- Mutually authenticated connections when required.
- Access control:
- Backend roles and tenants aligned with Kubernetes RBAC.
- Separate endpoints or tenants for audit logs.
- Immutability:
- Restricted modification/deletion of audit indices.
- Append-only write patterns in backends like Loki.
Architectural choices here are driven by:
- Regulatory requirements (e.g., log tamper evidence).
- Organizational policies on who can access which log categories.
High-Level Data Flow Summary
Bringing the pieces together, a typical OpenShift logging data flow looks like:
- Log generation
- Containers, nodes, control-plane components, and APIs emit logs.
- Collection on nodes
- A DaemonSet collector reads log files or streams and attaches metadata.
- Classification and routing
- Logs are classified as application, infrastructure, or audit.
- Routing pipelines decide destinations and apply transformations.
- Transmission
- Logs are sent over secure channels to:
- Internal logging backend(s).
- External log receivers and SIEM tools.
- Storage and indexing
- Backends store logs partitioned by time, category, and tenant.
- Indices or streams are managed by lifecycle and retention policies.
- Access and analysis
- Users and operators query logs via:
- OpenShift web console.
- CLI tools.
- Backend-native interfaces like Kibana or Grafana.
- RBAC and backend permissions govern visibility.
This logical architecture ensures that, as clusters and workloads scale, logs remain centralized, searchable, and governed according to organizational policies.