Why quotas and limits matter in OpenShift
In OpenShift, every project (namespace) shares a finite pool of cluster resources: CPU, memory, storage, and object counts (Pods, Services, etc.). Without controls, a single application or team can consume disproportionate resources, affecting others and potentially destabilizing the cluster.
Resource quotas and limits provide:
- Fair sharing between teams and projects.
- Protection against runaway workloads.
- Predictability in capacity planning.
- Guardrails that align usage with organizational policy.
This chapter focuses on how OpenShift implements resource quotas and per-container limits within projects, and how they interact.
Key concepts: quota vs. limit
At a high level:
- ResourceQuota:
  A project-level object that caps the total amount of resources the project can consume, or the total number of certain objects.
  Example: "This project can use at most 8 CPU cores and 32 GiB of memory."
- LimitRange:
  A per-pod/per-container policy that defines default and maximum/minimum values for resource requests and limits inside the project.
  Example: "Each container must request at least 100m CPU and at most 2 CPU; default limit is 500m CPU if not specified."
They complement each other:
- LimitRange controls individual pod/container behavior.
- ResourceQuota controls aggregate consumption at the project level.
ResourceQuota: controlling project-wide usage
What ResourceQuota can limit
A ResourceQuota can restrict:
- Compute resources (sum across pods in the project):
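  - `requests.cpu`, `limits.cpu`
  - `requests.memory`, `limits.memory`
- Extended resources like GPUs (e.g. `requests.nvidia.com/gpu`)
- Pod and object counts:
  - `pods`, `replicationcontrollers`, `persistentvolumeclaims`, `services`, `configmaps`, `secrets`
  - Other resource types such as Deployments, StatefulSets, Jobs, and Routes, using the `count/<resource>.<group>` form (e.g. `count/deployments.apps`, `count/routes.route.openshift.io`)
- Storage-related resources:
  - `requests.storage`
  - Storage-class-specific usage (e.g. `<storage-class-name>.storageclass.storage.k8s.io/requests.storage`)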
The exact set depends on the cluster version and enabled APIs, but the pattern is consistent: quota keys define a maximum for some measurable resource or count.
Basic structure of a ResourceQuota
A typical ResourceQuota YAML looks like:
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: project-quota
      namespace: my-project
    spec:
      hard:
        requests.cpu: "4"
        limits.cpu: "8"
        requests.memory: "8Gi"
        limits.memory: "16Gi"
        pods: "40"
        persistentvolumeclaims: "10"
        requests.storage: "200Gi"

Key fields:
- `metadata.namespace`: the project where this quota applies.
- `spec.hard`: a map of resource names to maximum allowed values.
How quota enforcement works
When you create or modify a resource (e.g. Deployment, Pod, PVC), the API server:
- Calculates the effective resource requests/limits for the new or updated objects.
- Sums these with the usage of existing resources in the same namespace.
- Compares the total against the `hard` limits defined in the `ResourceQuota`.
If the new total would exceed any quota:
- The API request is rejected with an error.
- No partial creation occurs; the resource is not admitted.
Quotas are enforced at admission time (when objects are created/updated), not continuously by some background process.
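The admission check described above can be sketched in a few lines of Python. This is a simplified illustration, not the API server's actual implementation: the names `parse_quantity` and `admit` are hypothetical, the quantity parser handles only a subset of the suffixes Kubernetes accepts, and the `used` values are made up for the example.

```python
from fractions import Fraction

# Subset of the binary and decimal suffixes used in Kubernetes quantities.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}

def parse_quantity(text):
    """Convert a quantity such as "500m", "2", or "8Gi" to an exact number."""
    text = str(text).strip()
    if text.endswith("m"):  # milli-units: 500m CPU is half a core
        return Fraction(text[:-1]) / 1000
    for suffix, factor in _SUFFIXES.items():
        if text.endswith(suffix):
            return Fraction(text[: -len(suffix)]) * factor
    return Fraction(text)

def admit(hard, used, requested):
    """Return (ok, reasons). The request is rejected if any key would exceed hard."""
    reasons = []
    for key, limit in hard.items():
        current = used.get(key, "0")
        extra = requested.get(key, "0")
        if parse_quantity(current) + parse_quantity(extra) > parse_quantity(limit):
            reasons.append(f"{key}: requested {extra}, used {current}, limited {limit}")
    return (not reasons, reasons)

# Hypothetical namespace state: 3.5 of 4 CPU cores already requested.
hard = {"requests.cpu": "4", "requests.memory": "8Gi", "pods": "40"}
used = {"requests.cpu": "3500m", "requests.memory": "6Gi", "pods": "12"}

ok_small, _ = admit(hard, used, {"requests.cpu": "500m", "requests.memory": "1Gi", "pods": "1"})
ok_large, why = admit(hard, used, {"requests.cpu": "1", "requests.memory": "1Gi", "pods": "1"})

print(ok_small)       # the 500m pod lands exactly on the quota and is admitted
print(ok_large, why)  # the 1-core pod would push requests.cpu to 4.5 and is rejected
```

Note that the check is all-or-nothing per request: a single exceeded key is enough to reject the whole object, which matches the "no partial creation" behavior described above.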
Multiple quotas per namespace
You can define multiple ResourceQuota objects in a namespace. Their effects combine:
- The usage is checked against each quota object independently.
- A request is rejected if it violates any of them.
This allows more granular policy definition, for example:
- One quota for compute resources (CPU/memory).
- Another for object counts (Pods, PVCs).
- Another for storage per storage class.
LimitRange: per-container policies inside a project
While quotas cap the total, LimitRange controls the resources at the pod/container level within a namespace.
What LimitRange controls
A LimitRange can define:
- Default requests and limits for containers that omit them:
  - `default` (limits)
  - `defaultRequest` (requests)
- Maximum and minimum allowed requests/limits:
  - `max`
  - `min`
- Ratios (in some cases) between requests and limits (`maxLimitRequestRatio`).
Limit ranges apply to:
- Containers (`type: Container`)
- Optionally to Pods, PVCs, or images (less common in typical app workloads).
Example LimitRange
    apiVersion: v1
    kind: LimitRange
    metadata:
      name: container-limits
      namespace: my-project
    spec:
      limits:
      - type: Container
        default:
          cpu: "500m"
          memory: "512Mi"
        defaultRequest:
          cpu: "250m"
          memory: "256Mi"
        max:
          cpu: "2"
          memory: "2Gi"
        min:
          cpu: "100m"
          memory: "128Mi"

Implications:
- If a container has no resource section, it automatically gets:
  - `requests.cpu: 250m`, `limits.cpu: 500m`
  - `requests.memory: 256Mi`, `limits.memory: 512Mi`
- A container cannot set:
  - Less than `100m` CPU or `128Mi` memory request,
  - More than `2` CPU or `2Gi` memory limit.
If you try to create a pod whose container violates these constraints, the admission is denied.
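To make the defaulting and validation rules concrete, here is a minimal Python sketch that mirrors the example LimitRange above. It is a simplification under stated assumptions: `apply_limit_range` is a hypothetical helper, values are plain integers (CPU in millicores, memory in MiB), and it only covers min, max, and defaults, not `maxLimitRequestRatio` or Pod-level ranges.

```python
# Values from the example LimitRange: CPU in millicores, memory in MiB.
LIMIT_RANGE = {
    "default": {"cpu": 500, "memory": 512},
    "defaultRequest": {"cpu": 250, "memory": 256},
    "max": {"cpu": 2000, "memory": 2048},
    "min": {"cpu": 100, "memory": 128},
}

def apply_limit_range(resources, limit_range=LIMIT_RANGE):
    """Fill in missing requests/limits, then validate against min/max.

    Returns (effective_resources, errors); a non-empty error list means
    the container would be rejected at admission.
    """
    requests = dict(resources.get("requests", {}))
    limits = dict(resources.get("limits", {}))
    errors = []
    for name in ("cpu", "memory"):
        had_limit = name in limits
        if not had_limit:
            limits[name] = limit_range["default"][name]
        if name not in requests:
            # An explicit limit without a request is copied into the request;
            # otherwise the LimitRange defaultRequest applies.
            requests[name] = limits[name] if had_limit else limit_range["defaultRequest"][name]
        if requests[name] < limit_range["min"][name]:
            errors.append(f"{name} request {requests[name]} is below min {limit_range['min'][name]}")
        if limits[name] > limit_range["max"][name]:
            errors.append(f"{name} limit {limits[name]} is above max {limit_range['max'][name]}")
        if requests[name] > limits[name]:
            errors.append(f"{name} request {requests[name]} exceeds limit {limits[name]}")
    return {"requests": requests, "limits": limits}, errors

effective, errs = apply_limit_range({})                       # no resources section
_, too_small = apply_limit_range({"requests": {"cpu": 50}})   # below the 100m minimum
_, too_big = apply_limit_range({"limits": {"cpu": 3000}})     # above the 2 CPU maximum

print(effective, errs)
print(too_small)
print(too_big)
```

One consequence worth noticing: a container that sets only a request larger than the default limit ends up with a request above its limit after defaulting, which the last check flags.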
Interaction between quotas and limits
Understanding the interaction is crucial for avoiding confusing errors.
Automatic requests/limits vs. quota
When a namespace has both:
- A `LimitRange` with default requests/limits, and
- A `ResourceQuota` on requests/limits/pods,
then:
- Pods without explicit resource requests/limits get defaulted values.
- Those defaulted values count against the project’s quota.
Example:
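- Quota: `requests.cpu: "2"`
- LimitRange defaultRequest: `cpu: "500m"`
If you create 5 pods without explicit requests, each gets 500m:
- The first 4 pods request 4 × 500m = 2 CPU, which exactly fills the quota.
- Total requests for 5 pods would be 5 × 500m = 2.5 CPU > 2 CPU, so the 5th pod will be rejected.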
From the developer’s perspective, it may look like they “just created pods” and hit quota unexpectedly; the defaults from LimitRange are responsible for the usage.
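The arithmetic of this example can be replayed step by step. The short Python loop below is only a worked illustration using the two numbers from the example (quota of 2 CPU, default request of 500m); the variable names are made up for this sketch.

```python
QUOTA_REQUESTS_CPU_M = 2000   # requests.cpu: "2", in millicores
DEFAULT_REQUEST_CPU_M = 500   # LimitRange defaultRequest cpu: "500m"

used_m = 0
results = []
for pod_number in range(1, 6):
    # Each pod has no explicit request, so the default is charged to the quota.
    if used_m + DEFAULT_REQUEST_CPU_M <= QUOTA_REQUESTS_CPU_M:
        used_m += DEFAULT_REQUEST_CPU_M
        results.append((pod_number, "admitted", used_m))
    else:
        results.append((pod_number, "rejected", used_m))

for pod_number, outcome, used in results:
    print(f"pod {pod_number}: {outcome}, requests.cpu used = {used}m")
```

Pods 1 through 4 are admitted and bring usage to 2000m; pod 5 is rejected because it would raise usage to 2500m.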
Common error scenarios
Typical API error messages when hitting quotas or limits:
- Quota exceeded (e.g. pods):
  - Admission error: "exceeded quota: project-quota, requested: pods=1, used: 40, limited: 40"
- Quota exceeded (e.g. memory):
  - "exceeded quota: project-quota, requested: requests.memory=512Mi, used: 8Gi, limited: 8Gi"
- LimitRange violation:
  - "must not specify more than 2 for cpu"
  - "must not specify less than 100m for cpu"
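As a user, read these messages as pointers to the constraint that was hit: quota errors include the name of the ResourceQuota object, and LimitRange errors state the minimum or maximum that was violated. Exact wording varies between versions. Inspect the corresponding object to understand the constraint.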
Typical patterns for setting quotas and limits
Per-environment quotas
Organizations often apply different quotas per environment:
- Development namespaces
  - Lower limits to encourage efficient testing.
  - Example: `requests.cpu: 2`, `requests.memory: 4Gi`, `pods: 20`.
- Staging/pre-production
  - Higher limits for realistic performance testing.
  - Example: `requests.cpu: 8`, `requests.memory: 32Gi`, `pods: 50`.
- Production
  - Carefully sized based on capacity planning.
  - May have stricter per-pod `LimitRange` to prevent outliers.
Per-team or per-application segmentation
Quotas can be used to:
- Ensure each team has a clearly defined budget of CPU/memory.
- Prevent a single team from exhausting all PVCs or routes.
- Provide “tiers” of service (e.g. standard vs. premium namespaces with different quotas).
Working with quotas and limits as a developer
Even if you’re not a cluster admin, quotas and limits affect how you design and deploy your applications.
Specifying requests and limits explicitly
Defining resources is essential for predictable behavior:
- In your Pod or Deployment spec, for each container:
    resources:
      requests:
        cpu: "200m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"

Benefits:
- Pods are scheduled based on their actual needs.
- You avoid relying on defaults you might not control.
- It’s easier to reason about quota usage before deploying.
Estimating usage against quota
To avoid surprises:
- Estimate per-pod usage (requests/limits).
- Multiply by desired replica count.
- Compare the totals to the project’s quota.
If your Deployment will scale up (e.g. HPA), factor in the maximum replica count when checking against the quota.
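The estimate can be turned into a quick calculation. The sketch below is one possible way to do it, assuming plain integer units (CPU in millicores, memory in MiB); `max_replicas_within_quota` is a hypothetical helper, the `hard` values come from the example quota earlier in this chapter, and the `used` values and HPA maximum are invented for illustration.

```python
def max_replicas_within_quota(per_pod, hard, used):
    """Largest number of additional replicas that fits under every quota key."""
    fits = []
    for key, limit in hard.items():
        cost = per_pod.get(key, 0)
        if cost > 0:
            # Whole replicas that fit into the remaining headroom for this key.
            fits.append((limit - used.get(key, 0)) // cost)
    return max(0, min(fits)) if fits else None

# Per-pod usage from the container spec above, plus one pod object each.
per_pod = {"requests.cpu": 200, "limits.cpu": 500,
           "requests.memory": 256, "limits.memory": 512, "pods": 1}

# Example quota from this chapter, converted to millicores and MiB.
hard = {"requests.cpu": 4000, "limits.cpu": 8000,
        "requests.memory": 8192, "limits.memory": 16384, "pods": 40}

# Hypothetical current usage by other workloads in the project.
used = {"requests.cpu": 1000, "limits.cpu": 2000,
        "requests.memory": 2048, "limits.memory": 4096, "pods": 5}

headroom = max_replicas_within_quota(per_pod, hard, used)
hpa_max_replicas = 15  # assumed HPA maximum

print(headroom)
print(hpa_max_replicas <= headroom)
```

With these numbers the binding constraint is `limits.cpu` (6000m of headroom at 500m per pod), so 12 replicas fit and an HPA maximum of 15 would eventually hit the quota.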
Understanding failures when scaling
Scaling a Deployment or StatefulSet may fail if:
- New pods would cause resource or object counts to exceed quota.
In that case:
- Scaling commands might succeed from the CLI point of view (desired replicas updated),
- But some replicas are never created (a quota rejection happens at admission, so no Pod object exists to sit in `Pending`), with quota-related errors in events: "Error creating: pods "..." is forbidden: exceeded quota ..."
As a developer:
- Check events on the Deployment/ReplicaSet/Pod to see if quota is blocking new pods.
- Coordinate with your admin to adjust either:
- The quota, or
- Your application’s resource requests/limits or replica counts.
Viewing and understanding quotas and limits in OpenShift
You typically interact with quotas and limits via:
OpenShift web console
Within a project/namespace:
- Project/Namespace overview often shows:
  - Current quota definitions and usage (CPU, memory, storage).
- Dedicated Quotas or Limits sections (names vary slightly by version) list:
  - `ResourceQuota` objects with used vs. hard values.
  - `LimitRange` objects with default/min/max definitions.
You can:
- Inspect which quotas apply to your project.
- See how close you are to CPU, memory, pod, or storage caps.
`oc` CLI basics
Core commands (names may vary slightly by version, but this is the common pattern):
- List quotas:
  - `oc get resourcequota`
  - `oc describe resourcequota <name>`
- List limit ranges:
  - `oc get limitrange`
  - `oc describe limitrange <name>`
- See summarized usage in current namespace:
  - `oc describe namespace $(oc project -q)` (shows quota-related info, depending on cluster configuration)
These commands help diagnose why deployments fail and how resources are being consumed.
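For scripted checks, the same information is available in machine-readable form: `oc get resourcequota -o json` returns a list whose items carry `status.hard` and `status.used`. The Python sketch below parses such output into used-vs-hard rows. The helper name `quota_usage` and the sample document are illustrative assumptions; in practice the JSON string would come from running the command in your project.

```python
import json

def quota_usage(quota_list_json):
    """Return (quota name, key, used, hard) rows from a ResourceQuota list in JSON."""
    doc = json.loads(quota_list_json)
    rows = []
    for item in doc.get("items", []):
        name = item["metadata"]["name"]
        status = item.get("status", {})
        hard = status.get("hard", {})
        used = status.get("used", {})
        for key in sorted(hard):
            rows.append((name, key, used.get(key, "0"), hard[key]))
    return rows

# Hand-written sample shaped like `oc get resourcequota -o json` output.
SAMPLE = """
{
  "kind": "List",
  "items": [
    {
      "kind": "ResourceQuota",
      "metadata": {"name": "project-quota", "namespace": "my-project"},
      "status": {
        "hard": {"pods": "40", "requests.cpu": "4"},
        "used": {"pods": "40", "requests.cpu": "3500m"}
      }
    }
  ]
}
"""

rows = quota_usage(SAMPLE)
for name, key, used, hard in rows:
    print(f"{name}: {key} used {used} of {hard}")
```

Here the `pods` row shows 40 of 40 in use, which is the situation that produces the "exceeded quota" error for pods shown earlier.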
Best practices for using quotas and limits
From a practical, day-to-day perspective:
- Always set requests and limits in your workload manifests.
- Check your project’s quotas before planning large deployments or scale-ups.
- Use realistic values based on observation and profiling, not guesses.
- Avoid “unlimited” patterns like very large limits that might violate `LimitRange` or defeat fair sharing.
- Coordinate with cluster admins:
  - If you consistently hit quota, review whether:
    - Your app is over-requesting resources,
    - Or the project’s quotas need to be increased.
Used correctly, resource quotas and limits give you predictable, stable environments on OpenShift while protecting shared infrastructure from accidental overload.