Overview of the Kubernetes Control Plane
In Kubernetes, the control plane is the “brain” of the cluster. It makes global decisions (scheduling, scaling, reacting to failures) and provides a consistent view of the cluster state. Worker nodes run your application containers; the control plane decides what should run where and when.
This chapter focuses on the standard, upstream Kubernetes control plane components and how they work together:
- `kube-apiserver`
- `etcd`
- `kube-scheduler`
- `kube-controller-manager`
- `cloud-controller-manager` (in cloud environments)
You will later see how OpenShift builds on these ideas, but here we focus on the generic Kubernetes concepts.
kube-apiserver
kube-apiserver is the front door to the Kubernetes control plane.
Role
- Acts as the single entry point for all operations on the cluster.
- Exposes a REST API (the Kubernetes API).
- Validates and processes all requests:
- From users and automation tools (`kubectl`, CI/CD, etc.).
- From other control plane components.
- From the worker nodes.
Conceptually, the API server is the only component that talks directly to the cluster’s persistent store (etcd).
Responsibilities
- Authentication and authorization
- Verifies who you are (authn).
- Checks what you are allowed to do (authz, usually via RBAC).
- Admission control
- Runs admission plugins (e.g., to enforce quotas, security policies, defaults).
- Object validation
- Ensures submitted objects (Pods, Deployments, etc.) are syntactically and semantically valid.
- Serving and updating cluster state
- Persists valid objects in `etcd`.
- Serves current cluster state to clients via the API.
- Watch mechanism
- Other components subscribe (“watch”) for changes.
- When objects change, the API server streams updates to watchers.
How other components use the API server
Other control plane components never write directly to etcd. Instead, they:
- Read desired and current state via the API.
- Compute what needs to change.
- Write their changes back via the API.
This pattern (read–reconcile–write via the API server) is central to how Kubernetes works.
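As a sketch, the read–reconcile–write pattern can be illustrated with an in-memory dictionary standing in for the API server (all names here are illustrative, not a real client library):

```python
# Toy illustration of the read-reconcile-write pattern.
# The "api" dict stands in for the API server; a real component
# would use a Kubernetes client library and watches instead.

def reconcile(api):
    desired = api["deployment"]["replicas"]   # read desired state via the API
    current = len(api["pods"])                # read current state via the API
    if current < desired:                     # compute what needs to change...
        for i in range(current, desired):
            api["pods"].append(f"pod-{i}")    # ...and write changes back
    elif current > desired:
        del api["pods"][desired:]

api = {"deployment": {"replicas": 3}, "pods": ["pod-0"]}
reconcile(api)
print(api["pods"])  # three pods after one pass
```

Real controllers run this loop continuously, triggered by watch events rather than a single call.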
etcd
etcd is the strongly consistent, distributed key–value store backing the Kubernetes API.
Role
- Acts as the single source of truth for:
- All Kubernetes objects (Pods, Deployments, Services, etc.).
- Cluster configuration and metadata.
- Stores the desired state of the cluster as maintained by the API server.
In production, etcd is usually run as a highly available cluster (odd number of members) to tolerate failures.
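The odd member count follows from etcd's quorum requirement: a write must be acknowledged by a majority of members. A quick calculation (plain arithmetic, not etcd code) shows why even sizes add no extra fault tolerance:

```python
# Quorum size and tolerable failures for an etcd cluster of n members.
def quorum(n: int) -> int:
    return n // 2 + 1          # a majority of members

def tolerable_failures(n: int) -> int:
    return n - quorum(n)       # members that can fail while quorum still holds

for n in (1, 3, 4, 5):
    print(n, quorum(n), tolerable_failures(n))
# 3 members tolerate 1 failure; 4 members still tolerate only 1,
# which is why production clusters use odd sizes such as 3 or 5.
```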
Characteristics Relevant to Kubernetes
- Strong consistency:
- Every successful write is guaranteed to be visible to subsequent reads.
- Important for correctness of scheduling and controllers.
- Watch support:
- API server watches `etcd` for changes and relays them to clients.
- Versioning and history:
- Kubernetes objects have resource versions, enabling safe concurrency and optimistic locking.
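The optimistic-locking idea can be sketched as follows (a simplified in-memory model, not the actual etcd or API machinery):

```python
# Sketch of optimistic concurrency using resource versions: a write that
# presents a stale version is rejected instead of silently overwriting.
class Conflict(Exception):
    pass

class Store:
    def __init__(self):
        self._data = {}  # key -> (value, resource_version)

    def get(self, key):
        return self._data[key]

    def put(self, key, value, expected_version=None):
        current_version = self._data.get(key, (None, 0))[1]
        if expected_version is not None and expected_version != current_version:
            raise Conflict(f"stale write: client has {expected_version}, "
                           f"store is at {current_version}")
        self._data[key] = (value, current_version + 1)
        return current_version + 1

store = Store()
v1 = store.put("pod/web", {"phase": "Pending"})                   # version 1
store.put("pod/web", {"phase": "Running"}, expected_version=v1)   # succeeds
# A second client still holding v1 now fails instead of clobbering:
try:
    store.put("pod/web", {"phase": "Failed"}, expected_version=v1)
except Conflict:
    print("conflict detected; client must re-read and retry")
```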
What gets stored in etcd
Examples of data types:
- Cluster-scoped objects: `Nodes`, `Namespaces`, `ClusterRoles`, etc.
- Namespaced objects: `Pods`, `Deployments`, `Services`, `ConfigMaps`, `Secrets`, etc.
- Internal configuration: API server configuration, leases, locks.
While you rarely interact with etcd directly in managed environments, its performance and reliability directly affect API responsiveness and cluster stability.
kube-scheduler
kube-scheduler assigns Pods to Nodes.
Role
- Watches for unscheduled Pods (Pods with no Node assigned).
- Chooses the “best” Node for each Pod according to scheduling policies.
- Writes the scheduling decision back via the API server by setting the Pod's `spec.nodeName`.
It does not start containers itself; that is the job of the node’s kubelet. The scheduler only decides placement.
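After binding, the Pod object carries the chosen Node in its spec. A sketch of what a scheduled Pod might look like (Pod name, Node name, and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-1                    # illustrative name
spec:
  nodeName: worker-2             # filled in by kube-scheduler at binding time
  containers:
  - name: web
    image: example.com/web:1.0   # illustrative image
```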
Scheduling process (high level)
For each pending Pod:
- Filter (Predicates)
- Eliminate Nodes that cannot run the Pod.
- Examples:
- Insufficient CPU / memory.
- Node doesn’t match the Pod’s node selector or affinities.
- Node is tainted in a way the Pod cannot tolerate.
- Required volumes or resources aren’t available.
- Score (Priorities)
- Rank the remaining Nodes.
- Examples:
- Prefer Nodes with more free resources.
- Spread Pods across failure domains (zones, nodes).
- Honor Pod affinity/anti-affinity.
- Bind
- Select the highest-scoring Node.
- Write a binding decision (or update the Pod) through the API server.
If no suitable Node exists, the Pod remains pending until conditions change (e.g., a new Node joins or resources free up).
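The filter/score/bind sequence can be illustrated with a toy scheduler (node names and resource numbers are invented; real scoring combines many weighted plugins):

```python
# Toy scheduler: filter nodes that fit the Pod, score by free CPU, bind to best.
nodes = {
    "node-a": {"free_cpu": 2.0, "free_mem_gib": 4},
    "node-b": {"free_cpu": 0.5, "free_mem_gib": 8},
    "node-c": {"free_cpu": 4.0, "free_mem_gib": 16},
}
pod = {"name": "web-1", "cpu": 1.0, "mem_gib": 2}

# Filter: eliminate nodes that cannot run the Pod.
feasible = [n for n, r in nodes.items()
            if r["free_cpu"] >= pod["cpu"] and r["free_mem_gib"] >= pod["mem_gib"]]

if not feasible:
    print("Pod stays Pending")  # no bind; wait for conditions to change
else:
    # Score: here, simply prefer the node with the most free CPU.
    best = max(feasible, key=lambda n: nodes[n]["free_cpu"])
    # Bind: in real Kubernetes this is a write through the API server.
    pod["nodeName"] = best
    print(pod["nodeName"])  # node-c
```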
Extensibility
The scheduler is pluggable:
- Scheduling profiles and plugins can alter:
- How filtering/scoring is done.
- How scheduling decisions are made.
- Third-party or custom schedulers can be used for specialized workloads (e.g., GPU-heavy, latency-sensitive, or batch jobs).
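A Pod opts into a non-default scheduler via `spec.schedulerName`; a sketch (the scheduler and Pod names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: training-job-1               # illustrative name
spec:
  schedulerName: gpu-scheduler       # handled by a custom scheduler, not kube-scheduler
  containers:
  - name: trainer
    image: example.com/trainer:1.0   # illustrative image
```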
kube-controller-manager
kube-controller-manager runs a collection of built-in controllers that continuously reconcile different aspects of cluster state.
Reconciliation concept
Each controller implements a loop:
- Observe desired state from the API server (e.g., a `Deployment` object).
- Observe current state of related resources (e.g., existing Pods, ReplicaSets).
- Compare desired vs. current.
- Act: create, update, or delete Kubernetes objects so that actual state moves toward desired state.
This is often described as:
$$
\text{Reconcile loop:}\ \text{desired state} - \text{current state} \rightarrow \text{actions}
$$
Examples of built-in controllers
A few important controllers that typically run inside kube-controller-manager:
- Node Controller
- Monitors Node health.
- Marks Nodes as `NotReady` if they stop reporting.
- May trigger Pod eviction and rescheduling from unhealthy Nodes.
- Replication / ReplicaSet Controller
- Ensures the specified number of Pod replicas are running.
- Creates or deletes Pods to match the desired replica count.
- Deployment Controller
- Manages rolling updates and rollbacks for Deployments.
- Creates and manages ReplicaSets for each version of the application.
- Namespace Controller
- Handles cleanup when a Namespace is deleted (removes resources in that namespace).
- Service Account & Token Controllers
- Create default ServiceAccounts and associated credentials.
- EndpointSlice / Service Controllers
- Maintain network endpoint information so Services can route traffic to the correct Pods.
All of these controllers use the same pattern: watch → compare → reconcile via the API server.
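As a sketch, the watch → compare → reconcile pattern for a ReplicaSet-like controller might look like this (in-memory stand-ins for the watch stream and API writes, not a real client):

```python
# Toy watch-driven controller: keep 2 replicas of "web" running.
desired = 2
running = {"web-1", "web-2"}   # state observed from an initial list/watch
actions = []                   # writes the controller would send to the API server

def on_event(event: str, pod: str):
    """React to one watch event, then compare current vs desired and act."""
    if event == "DELETED":
        running.discard(pod)
    elif event == "ADDED":
        running.add(pod)
    while len(running) < desired:
        replacement = f"web-new-{len(actions)}"
        actions.append(("create", replacement))
        running.add(replacement)  # assume the create succeeds

on_event("DELETED", "web-1")   # a Pod dies
print(actions)                 # a replacement creation was issued
print(sorted(running))         # back at 2 replicas
```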
cloud-controller-manager
cloud-controller-manager runs controllers that interact with the underlying cloud provider.
This component only appears in clusters integrated with cloud infrastructures (e.g., AWS, GCP, Azure, OpenStack). In on-prem or bare-metal setups, it may be absent or replaced with different integrations.
Why it exists
Originally, cloud-specific logic was deeply embedded into core Kubernetes binaries. cloud-controller-manager was introduced to:
- Decouple cloud provider code from core Kubernetes.
- Allow cloud providers to develop and ship their integrations independently.
- Make Kubernetes more portable and modular.
Typical cloud-related controllers
Examples of controllers that may run in cloud-controller-manager:
- Node Controller (cloud-specific)
- Adds cloud-specific metadata (labels for zone, region, instance type).
- Detects if a Node instance has been deleted at the cloud provider level.
- Route Controller
- Configures network routes in the cloud provider to enable inter-node networking.
- Service Controller
- Implements `LoadBalancer` Services by:
  - Creating cloud load balancers.
  - Attaching backend instances.
  - Updating the Service `status` with external IPs/hostnames.
- Volume Controller (in some setups)
- Manages cloud volumes (EBS, Persistent Disks, etc.) used by PersistentVolumes.
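For example, a `LoadBalancer` Service is declared like any other Service; the cloud Service controller provisions the load balancer and reports the result in `status` (names, ports, and the address shown are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                # illustrative name
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
# Once provisioned, the controller fills in something like:
# status:
#   loadBalancer:
#     ingress:
#     - ip: 203.0.113.10   # illustrative external IP
```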
How the Control Plane Components Work Together
To understand the interaction, consider a high-level example: creating a new Deployment.
- User submits a Deployment
  - A user (or CI/CD system) sends a request to `kube-apiserver` to create a `Deployment`.
  - API server:
    - Authenticates and authorizes the request.
    - Validates the object.
    - Stores it in `etcd`.
- Controllers react
  - The Deployment controller (in `kube-controller-manager`) watches Deployments.
  - It sees the new Deployment and creates a `ReplicaSet` with the desired number of replicas.
  - The ReplicaSet controller then creates the required number of Pods via the API server.
- Scheduler assigns Pods
  - `kube-scheduler` watches for Pods with no `nodeName`.
  - It evaluates Nodes, picks suitable ones, and writes binding decisions via the API server.
- Nodes run the Pods
  - Node-local components (like `kubelet`, described elsewhere) see the assigned Pods.
  - They pull images, start containers, and report status back to the API server.
- Ongoing reconciliation
  - If a Pod fails, the ReplicaSet controller notices fewer running replicas than desired and creates a replacement.
  - If a Node disappears, the Node controller marks it unhealthy; its Pods are eventually recreated elsewhere and placed by the scheduler.
Throughout this process:
- `kube-apiserver` is the hub.
- `etcd` is the backing store.
- Controllers and the scheduler implement control loops to drive the cluster toward the desired state.
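The walkthrough above starts with a Deployment like the following (a minimal sketch; names and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                # desired state the controllers reconcile toward
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: example.com/web:1.0   # illustrative image
```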
High Availability and Scalability Considerations
In production environments, control plane components are usually run in a highly available and scalable configuration:
- Multiple API server instances
- Run behind a load balancer.
- Share the same `etcd` cluster.
- etcd cluster
- At least 3 members (often 3 or 5).
- Deployed on separate nodes for fault tolerance.
- Scheduler and controller-manager
- Typically run as multiple instances, but use leader election so only one active instance of each type performs work at any time.
- If the leader fails, another instance takes over.
This design aims to keep the cluster functional even if individual control plane nodes fail.
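Leader election among scheduler or controller-manager replicas can be sketched with a lease: the holder renews it periodically, and a standby takes over only once the lease expires (an in-memory toy, not the real Lease API):

```python
# Toy lease-based leader election between two controller-manager replicas.
LEASE_DURATION = 15  # seconds; Kubernetes uses a Lease object with similar fields

lease = {"holder": None, "renew_time": 0.0}

def try_acquire(candidate: str, now: float) -> bool:
    """Become (or stay) leader if the lease is free, ours, or expired."""
    expired = now - lease["renew_time"] > LEASE_DURATION
    if lease["holder"] in (None, candidate) or expired:
        lease["holder"] = candidate
        lease["renew_time"] = now
        return True
    return False

print(try_acquire("cm-0", now=0))    # True: cm-0 becomes leader
print(try_acquire("cm-1", now=5))    # False: lease still held and fresh
print(try_acquire("cm-1", now=30))   # True: lease expired, cm-1 takes over
```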
Summary
The Kubernetes control plane is composed of cooperative components, each with a focused responsibility:
- `kube-apiserver` – central API, security, and gateway to cluster state.
- `etcd` – strongly consistent store for all cluster data.
- `kube-scheduler` – decides where Pods should run.
- `kube-controller-manager` – runs controllers that continuously reconcile desired vs. current state.
- `cloud-controller-manager` – integrates with external/cloud infrastructure where applicable.
Together, they implement Kubernetes’s declarative model: you describe the desired state, and the control plane continuously works to make the cluster match that description.