Why rolling updates matter
Rolling updates let you deploy new versions of your application without stopping service. Instead of replacing all pods at once, OpenShift gradually replaces old pods with new ones, keeping the application available.
Rollbacks are the counterpart: if something goes wrong with a new version, you quickly return to a previously working state.
In OpenShift, rolling behavior is implemented by:
- Deployment objects (Kubernetes-native)
- DeploymentConfig objects (OpenShift-specific)
Both support rolling updates and rollbacks, but the mechanisms and knobs differ slightly.
This chapter focuses on:
- How rolling updates work conceptually in OpenShift
- How to control rollout behavior
- How to monitor rollouts
- How to perform and understand rollbacks
- Common pitfalls and best practices
Rolling updates with Deployments
A Deployment manages a ReplicaSet, which in turn manages pods. During a rolling update:
- A new ReplicaSet is created for the new version.
- Pods from the new ReplicaSet are gradually scaled up.
- Pods from the old ReplicaSet are gradually scaled down.
- Service traffic is continuously routed to all ready pods (old + new) via the underlying Service.
You control the rollout strategy via the spec.strategy field of the Deployment.
RollingUpdate strategy
For a rolling update, spec.strategy.type is RollingUpdate (the default):
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%

Key parameters:
maxUnavailable
- Maximum number of pods that can be unavailable during the update
- Can be an absolute number (1, 2, …) or a percentage (25%)
- Affects how aggressively old pods can be taken down before new ones become ready

maxSurge
- Maximum number of extra pods (beyond the desired replica count) allowed during the update
- Also numeric or percentage
- Affects how many new pods can be started at once
Example:
- Desired replicas: 4
- maxUnavailable: 1
- maxSurge: 1
During the rollout:
- Up to 5 pods may run at once (4 desired + 1 surge).
- At least 3 pods must remain available.
Choosing values is a trade-off between:
- Availability (keep more pods of any version running)
- Speed (roll out faster with larger maxSurge/maxUnavailable)
- Resource usage (surge pods consume additional CPU/memory/storage)
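For a critical service where availability matters more than rollout speed, a conservative combination might look like the following sketch (the values are assumptions to adapt to your capacity headroom):

spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never dip below the desired replica count
      maxSurge: 1         # introduce one extra pod at a time

With these values, all four old pods stay available throughout the rollout, at the cost of one extra pod's worth of resources while it runs.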
Deployment update triggers
Deployments create new revisions when certain fields change, most commonly:
- Container image (spec.template.spec.containers[].image)
- Environment variables
- ConfigMap/Secret references (if you change the Pod template to use new keys or names)
- Labels/annotations on the Pod template
A typical update workflow:
- Edit the Deployment (YAML, oc set image, or via the web console).
- A new ReplicaSet is created.
- Rolling update starts according to the configured strategy.
Example using oc:
oc set image deployment/myapp mycontainer=image-registry.example.com/team/myapp:v2
This triggers a new revision of the Deployment and starts a rolling update.
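Changing environment variables on the Pod template has the same effect. For example (the variable name LOG_LEVEL is purely illustrative):

oc set env deployment/myapp LOG_LEVEL=debug

This also creates a new revision and replaces pods in the same controlled fashion.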
Controlling progress and timeouts
Deployments track rollout progress:
- spec.progressDeadlineSeconds defines how long OpenShift/Kubernetes waits for the rollout to make progress.
- If the rollout stalls (e.g., pods crash-loop or never become ready), it is marked as failed.
Example:
spec:
  progressDeadlineSeconds: 600  # 10 minutes
If pods fail readiness checks or cannot be scheduled, the Deployment will not complete, and you’ll see conditions like ProgressDeadlineExceeded.
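You can query the Progressing condition directly; a sketch, assuming the Deployment is named myapp:

oc get deployment myapp -o jsonpath='{.status.conditions[?(@.type=="Progressing")].reason}'
# prints ProgressDeadlineExceeded if the rollout has stalled past the deadline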
OpenShift commands to inspect:
oc rollout status deployment/myapp
oc describe deployment/myapp

Rolling updates with DeploymentConfigs
DeploymentConfig is an OpenShift-specific controller with its own rolling strategy and trigger system. It behaves similarly to a Deployment but uses ReplicationController objects instead of ReplicaSet.
Rolling strategy in DeploymentConfig
The rolling strategy is defined under spec.strategy.type: Rolling:
spec:
  strategy:
    type: Rolling
    rollingParams:
      maxUnavailable: 25%
      maxSurge: 25%
      intervalSeconds: 1
      timeoutSeconds: 600
      updatePeriodSeconds: 1

Important parameters:
maxUnavailable / maxSurge
- Same concept as for Deployment.

intervalSeconds
- Time between polling deployment status and making progress decisions.

updatePeriodSeconds
- Time to wait between individual pod updates (throttling the rollout).

timeoutSeconds
- Overall timeout for the rollout; if exceeded, the deployment is considered failed.
You can also configure pre/post lifecycle hooks in rollingParams (e.g., to run migrations before switching fully to new pods), but hook details are typically treated in more advanced chapters.
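As a taste of what a hook looks like, here is a minimal sketch of a pre hook that runs a command in a new pod before the rollout proceeds (the migration script name is a placeholder):

spec:
  strategy:
    type: Rolling
    rollingParams:
      pre:
        failurePolicy: Abort        # abort the rollout if the hook fails
        execNewPod:
          containerName: mycontainer
          command: ["/bin/sh", "-c", "./run-migrations.sh"]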
DeploymentConfig triggers and image changes
DeploymentConfig supports triggers to start new deployments automatically:
Typical triggers:
- Image change trigger:
- Deploys automatically when a referenced image stream tag changes.
- Config change trigger:
- Deploys when the pod template changes in the DeploymentConfig.
Example snippet:
spec:
  triggers:
  - type: ConfigChange
  - type: ImageChange
    imageChangeParams:
      automatic: true
      containerNames:
      - mycontainer
      from:
        kind: ImageStreamTag
        name: myapp:latest
Whenever the ImageStreamTag myapp:latest is updated, a new rollout is initiated.
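With this trigger in place, retagging the image stream is enough to deploy. For example (the registry path is illustrative):

oc tag image-registry.example.com/team/myapp:v2 myapp:latest
# myapp:latest now points at v2, which kicks off a new rollout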
`oc` commands for DeploymentConfigs
Key commands:
- Start a new rollout manually:
oc rollout latest dc/myapp
- Watch rollout:
oc rollout status dc/myapp
The rollout behavior is still governed by rollingParams even if the rollout is manual.
Monitoring and managing rollouts
Regardless of whether you use Deployment or DeploymentConfig, you need to:
- Track rollout progress
- Inspect failures
- Potentially pause or resume rollouts
Inspecting rollout status
For Deployments:
oc rollout status deployment/myapp
oc get deployment myapp
oc describe deployment myapp

For DeploymentConfigs:
oc rollout status dc/myapp
oc get dc myapp
oc describe dc myapp

Typical things to look for:
- Conditions such as Progressing, Available, ReplicaFailure
- Events showing scheduling issues, image pull errors, or readiness probe failures
- Number of updated/available/unavailable replicas
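For a quick numeric view during a rollout, you can pull the replica counters straight from status (a sketch for a Deployment named myapp):

oc get deployment myapp -o jsonpath='updated={.status.updatedReplicas} available={.status.availableReplicas} unavailable={.status.unavailableReplicas}{"\n"}'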
Pausing and resuming rollouts (Deployments only)
Deployments support pausing:
oc rollout pause deployment/myapp
# modify spec.template, e.g., add environment variables, sidecars, etc.
oc rollout resume deployment/myapp

While paused, changes to the deployment specification are recorded but not applied to pods until you resume. This is useful for batching multiple changes into a single rollout.
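A concrete sketch of batching two changes into one rollout (the image tag and variable name are assumptions):

oc rollout pause deployment/myapp
oc set image deployment/myapp mycontainer=image-registry.example.com/team/myapp:v3
oc set env deployment/myapp FEATURE_FLAG=enabled
oc rollout resume deployment/myapp   # both changes roll out together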
Rollbacks: reverting to a previous version
If a new version misbehaves (errors, performance issues, failed health checks), you can roll back.
The core idea:
- Each rollout is stored as a revision (Deployment revision or DeploymentConfig version).
- A rollback sets the controller’s template back to a previous revision and starts another rolling update.
Rollbacks with Deployments
To view revisions and history:
oc rollout history deployment/myapp
oc rollout history deployment/myapp --revision=3

To roll back to the previous revision:

oc rollout undo deployment/myapp

To roll back to a specific revision:

oc rollout undo deployment/myapp --to-revision=3

Outcome:
- The Deployment spec (pod template) is reverted to the target revision.
- A new rolling update is started from the current pods to the reverted template.
Note:
- By default, Deployments keep a number of old ReplicaSets (spec.revisionHistoryLimit). If old revisions are garbage-collected, you cannot roll back to them.
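You can set the limit explicitly; for example, to keep ten old ReplicaSets available as rollback targets:

spec:
  revisionHistoryLimit: 10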
Rollbacks with DeploymentConfigs
DeploymentConfigs have a similar concept, with an internal version number.
View rollout history:
oc rollout history dc/myapp
oc rollout history dc/myapp --revision=3

Rollback:
oc rollout undo dc/myapp
# or to a specific revision:
oc rollout undo dc/myapp --to-revision=3
You can also manually set spec.template from a previous revision if you need finer control, but oc rollout undo is usually sufficient.
What gets reverted (and what doesn’t)
Rollbacks typically revert:
- Pod template (container images, env vars, ports, volumes, probes)
- Labels/annotations on the template
They do not revert:
- Persistent data in volumes or databases
- External services or configuration outside this specific controller
- Other resources like ConfigMaps or Secrets that may have changed independently
This means a “rollback” in OpenShift is a workload configuration rollback, not a full environment time machine.
Dealing with failed rollouts
During a rollout, failures can occur due to:
- Broken container image (crash loop, application error)
- Failing liveness or readiness probes
- Misconfigured environment variables or secrets
- Resource constraints (pods cannot schedule)
- Networking issues
Detecting failures
Use:
oc rollout status deployment/myapp
oc logs deployment/myapp
oc describe pod <pod-name>

or for DeploymentConfigs:
oc rollout status dc/myapp
oc logs dc/myapp
oc describe pod <pod-name>

Signs of a failed rollout:
- oc rollout status does not complete or reports failure
- Pods stuck in CrashLoopBackOff or ImagePullBackOff
- Conditions such as ProgressDeadlineExceeded
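Listing the pods behind the workload makes these states easy to spot (the app label is an assumption; use whatever selector your Pod template carries):

oc get pods -l app=myapp
# look for STATUS values such as CrashLoopBackOff or ImagePullBackOff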
Reacting to failures
Common responses:
- Immediate rollback:
- If the new version is clearly broken, use oc rollout undo to restore service quickly.
- Fix-forward:
- If the issue is minor and quick to fix, you may choose to push a corrected version as a new rollout rather than rolling back.
- Adjust strategy:
- If failures are related to load or capacity, you might:
- Reduce maxSurge/maxUnavailable
- Increase resources or limits
- Re-tune probes so they reflect realistic startup and readiness behavior (a probe sketch follows this list)
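As an example of probe re-tuning, a readiness probe with a more realistic startup allowance might look like this (path, port, and timings are assumptions to adapt to your application):

readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15   # give the app time to start before the first check
  periodSeconds: 5
  failureThreshold: 3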
Zero-downtime and readiness considerations
Rolling updates depend heavily on pods becoming ready before they are counted toward available capacity.
Key aspects:
- Readiness probes:
- Until a pod passes its readiness probe, it will not receive traffic via Services or Routes.
- If you misconfigure readiness (too strict, wrong path/port), the rollout can stall.
- Shutdown behavior:
- When a pod is terminated during a rolling update, it receives a SIGTERM and has a grace period (terminationGracePeriodSeconds) to shut down gracefully.
- If your app ignores termination signals or takes too long, users may experience errors.
- Stateful components:
- For stateful or session-heavy apps, consider:
- Allowing some time for connections to drain before killing old pods.
- Using readiness/lifecycle hooks (e.g., preStop) to coordinate shutdown (a sketch follows below).
- Deeper details are covered in chapters on storage and stateful applications.
Tuning readiness and termination behavior is essential for true zero-downtime rolling updates.
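For example, a minimal preStop hook that delays shutdown so in-flight connections can drain might look like this sketch (the sleep duration is an assumption; size it to your traffic):

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60   # must cover the preStop delay plus shutdown time
      containers:
      - name: mycontainer
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 10"]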
Blue-green and canary as alternatives
Rolling updates are not the only strategy:
- Blue-green:
- Run old (blue) and new (green) versions side-by-side.
- Switch traffic entirely at once using routing rules.
- Provides a clear, quick rollback path (switch back to blue).
- Canary:
- Send a small fraction of traffic to the new version.
- Gradually increase if no issues appear.
OpenShift’s rolling update feature can be combined with these patterns (for example, by using multiple Deployments and Routes), but the implementation details of blue-green and canary are typically addressed in more advanced deployment chapters.
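As a flavor of how such a pattern can look in OpenShift, a Route can split traffic between two Services backed by separate Deployments (service names and weights are assumptions; a minimal canary sketch):

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: myapp
spec:
  to:
    kind: Service
    name: myapp-stable
    weight: 90
  alternateBackends:
  - kind: Service
    name: myapp-canary
    weight: 10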
Best practices for rolling updates and rollbacks
- Treat your workload definitions (Deployment, DeploymentConfig) as version-controlled code.
- Use clear version tags for images (avoid mutable tags like latest in production).
- Always define readiness and liveness probes appropriate to your application.
- Start with conservative rollout parameters:
- maxUnavailable: 0 and a small maxSurge for critical services
- Relax these as you gain confidence and need faster rollouts.
- Regularly test rollback procedures in non-production environments so the process is familiar and reliable.
- Monitor not just rollout status, but also:
- Application-level metrics (latency, error rates)
- Logs during and after rollouts
- Keep revisionHistoryLimit at a sensible value so you have meaningful rollback targets without uncontrolled history growth.
Using rolling updates and rollbacks correctly lets you ship changes frequently while maintaining stability and user experience—one of the core advantages of OpenShift-based application deployment.