Kubernetes requests vs limits: why requests drive node count (and cost)

If you search for a Kubernetes cost calculator, you will quickly run into "requests" and "limits". The short version: requests are what the scheduler uses for capacity planning; limits are guardrails for bursting and safety. Mixing them up often leads to oversized clusters, or to unpredictable performance risk.

Sizing decision rules

  • Requests: use for scheduling and node count.
  • Limits: use for burst control, not capacity.
  • Overhead: include daemonsets and system reserve.

Requests: the scheduling baseline

Requests are the resources a pod asks for up front. The scheduler only places a pod on a node whose allocatable capacity can cover the sum of requests already on that node (plus overhead). That makes requests the right baseline for "how many nodes do I need?"

  • CPU requests affect packing and scheduling decisions.
  • Memory requests are often the real limiter, because memory is not compressible: CPU can be throttled and shared, but memory cannot be reclaimed without evicting or killing a pod.
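To see which resource binds packing, divide node allocatable capacity by per-pod requests in both dimensions and take the smaller bound. A minimal sketch; all numbers here are illustrative assumptions, not measurements:

```python
# Which resource binds packing, given per-pod requests and node
# allocatable capacity? All numbers are illustrative assumptions.
pod_cpu_request_m = 250        # millicores requested per pod
pod_mem_request_mi = 1024      # MiB requested per pod

node_alloc_cpu_m = 3600        # e.g. a 4-vCPU node after system reserve
node_alloc_mem_mi = 13000      # e.g. a ~16 GiB node after reserve/eviction threshold

pods_by_cpu = node_alloc_cpu_m // pod_cpu_request_m    # 14
pods_by_mem = node_alloc_mem_mi // pod_mem_request_mi  # 12

# The smaller bound wins: memory limits packing in this example.
pods_per_node = min(pods_by_cpu, pods_by_mem)
print(pods_per_node)  # 12
```

With these example numbers the node runs out of memory headroom before CPU, which is the common case for memory-heavy services.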

Limits: the ceiling (risk and stability)

Limits cap how much a container can use. Hitting a CPU limit causes throttling; exceeding a memory limit causes an OOM kill. Limits shape performance and risk, but the scheduler does not use them for capacity decisions the way it uses requests.

  • If CPU limits are too low, p95 latency can spike when traffic bursts.
  • If memory limits are too low, OOM churn can create retries, errors, and extra traffic.
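Because the scheduler ignores limits, the sum of limits on a fully packed node can exceed the node's capacity. A small sketch of that oversubscription math, with assumed numbers:

```python
# Limits are not a scheduling input, so the sum of limits on a packed
# node can exceed its capacity. Illustrative numbers only.
pods_per_node = 12
cpu_limit_m = 1000           # assumed 4x a 250m request
node_cpu_m = 4000            # raw capacity of a 4-vCPU node

total_limits_m = pods_per_node * cpu_limit_m     # 12000m of "allowed" burst
oversubscription = total_limits_m / node_cpu_m   # 3.0x the node's CPU

# If several pods burst at once they contend, and throttling or latency
# spikes appear even though every pod is individually within its limit.
print(oversubscription)  # 3.0
```

This is why limits describe burst risk rather than capacity: the 3x figure tells you how badly simultaneous bursts can contend, not how many nodes you need.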

Common mistakes (and how they inflate cost)

  • Using limits as requests: many teams set limits to 2-4x requests for burstability; treating that as a baseline inflates node estimates.
  • Ignoring overhead: kube-system, daemonsets, and headroom reduce allocatable capacity.
  • Assuming perfect packing: affinities, max pods/node, and topology spread constraints raise node count beyond the math minimum.
  • Using peak traffic 24/7: plan baseline and peak as separate scenarios.
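The "perfect packing" point is easy to quantify: a per-node max-pods cap can bind before CPU or memory do. A sketch with assumed numbers (the max-pods value is hypothetical, e.g. an ENI-limited instance type):

```python
import math

# A per-node max-pods cap can raise node count beyond the resource math.
pods = 40
pods_per_node_by_resources = 15   # from the request math (assumed)
max_pods_per_node = 10            # platform cap, e.g. ENI-limited (assumed)

nodes_resource_math = math.ceil(pods / pods_per_node_by_resources)       # 3
nodes_actual = math.ceil(pods / min(pods_per_node_by_resources,
                                    max_pods_per_node))                  # 4
print(nodes_resource_math, nodes_actual)  # 3 4
```

Affinities and topology spread constraints push in the same direction, which is why the math minimum is a floor, not an estimate.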

A simple sizing workflow

  1. Pick representative requests (baseline month, not incident peak).
  2. Estimate total requests = pods x per-pod requests (CPU and memory).
  3. Apply allocatable % to node capacity (leave headroom for overhead).
  4. Compute node count from CPU and memory, then take the larger number.
  5. Add a peak scenario and compare the delta (autoscaling vs always-on capacity).
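The workflow above fits in a few lines of arithmetic. A sketch with illustrative inputs; substitute your own baseline requests and node shape:

```python
import math

# Inputs: all numbers are illustrative assumptions.
pods = 40
pod_cpu_m, pod_mem_mi = 250, 1024        # per-pod requests (step 1)
node_cpu_m, node_mem_mi = 4000, 16384    # raw node capacity: 4 vCPU / 16 GiB
allocatable_pct = 0.85                   # headroom for kubelet, daemonsets, eviction

# Step 2: total requests across all pods.
total_cpu_m = pods * pod_cpu_m           # 10000m
total_mem_mi = pods * pod_mem_mi         # 40960 MiB

# Step 3: usable capacity per node after overhead.
alloc_cpu_m = node_cpu_m * allocatable_pct     # 3400m
alloc_mem_mi = node_mem_mi * allocatable_pct   # ~13926 MiB

# Step 4: node count from each dimension; the larger wins.
nodes_by_cpu = math.ceil(total_cpu_m / alloc_cpu_m)    # 3
nodes_by_mem = math.ceil(total_mem_mi / alloc_mem_mi)  # 3
baseline_nodes = max(nodes_by_cpu, nodes_by_mem)

# Step 5: a peak scenario (here assumed 2x pods) for the autoscaling delta.
peak_nodes = max(math.ceil(2 * total_cpu_m / alloc_cpu_m),
                 math.ceil(2 * total_mem_mi / alloc_mem_mi))
print(baseline_nodes, peak_nodes)  # 3 6
```

The baseline-vs-peak delta (here 3 vs 6 nodes) is the quantity to compare against autoscaling versus always-on capacity.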

Tool: Kubernetes Requests & Limits Calculator. Once you have node count, price it with Kubernetes Node Cost.


Related guides

Kubernetes requests & limits: practical sizing (and cost impact)
How to size clusters from requests, choose allocatable headroom, and use limits to reason about burst risk - with a calculator, a worked template, and common pitfalls.
EKS node sizing: requests, overhead, and why packing is never perfect
A practical EKS node sizing guide: size from requests, reserve headroom, account for DaemonSets and max-pods limits, and understand why real scheduling often needs more nodes than the math minimum.
EKS vs GKE vs AKS cost: a practical comparison checklist (beyond node price)
Compare managed Kubernetes costs across EKS, GKE, and AKS by modeling the same line items: nodes, control plane, load balancers, storage, observability, and egress. Includes a worksheet template and validation steps for baseline vs peak.
Kubernetes cost calculator (cluster pricing checklist)
A practical checklist for estimating Kubernetes costs: node compute, control plane, load balancers, storage, egress, and observability. Includes a fast workflow, calculator links, and common pitfalls.
Kubernetes cost model beyond nodes: the checklist most teams miss
A practical Kubernetes cost model checklist: control plane, load balancers, storage, logs/metrics, and egress - plus links to calculators to estimate each part.
Kubernetes costs explained: nodes, egress, load balancers, and observability
A practical Kubernetes cost model: node baseline, cluster add-ons, load balancers, egress/data transfer, and logs/metrics. Includes the most common mistakes and the best calculators.

FAQ

Why do requests drive node count?
The scheduler places pods based on requests. When total requests exceed allocatable capacity, you need more nodes regardless of average usage.
Are limits useless for sizing?
No. Limits are useful for understanding burst risk and stability, but they are not the baseline input for node count unless the workload frequently runs near the limit.
What is the most common mistake?
Setting limits high and then treating limits as requests. That typically inflates node estimates and cost without improving reliability.

Last updated: 2026-02-07