Kubernetes requests & limits: practical sizing (and cost impact)

The Kubernetes scheduler places pods based on their requests, not their limits. That is why capacity planning starts from requests and uses limits only to reason about burst and risk. Mixing them up usually produces oversized clusters (sizing from limits) or unpredictable performance (sizing from actual usage alone).

1) Requests drive node count

A simple approach: total requests = pods x per-pod request. Then divide by allocatable capacity per node and round up. Our calculator does that, taking the larger of the CPU-based and memory-based node counts.
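The arithmetic can be sketched in a few lines; the numbers below are illustrative placeholders, not recommendations:

```python
import math

# Illustrative numbers; substitute your own workload figures.
pods = 40
pod_cpu_request = 0.5        # cores requested per pod
node_allocatable_cpu = 3.6   # cores usable per node after reservations

total_cpu = pods * pod_cpu_request                   # 20.0 cores of requests
nodes = math.ceil(total_cpu / node_allocatable_cpu)  # 20.0 / 3.6 -> 6 nodes
print(nodes)
```

Rounding up matters: a fractional node is a whole node you still have to pay for.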

Tool: Kubernetes Requests & Limits Calculator

2) Leave allocatable headroom

Nodes are not 100% allocatable. System overhead, daemonsets, and kubelet reservations reduce usable capacity. Planning with 85-95% allocatable is common depending on your environment.

  • If you run many daemonsets or have strict headroom targets, use a lower allocatable %.
  • If you have a stable workload and validated overhead, you can increase allocatable %.
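The allocatable percentage directly moves the node count, so it is worth testing the sensitivity. A minimal sketch, assuming 68 cores of total requests on 4-core nodes (both numbers are made up):

```python
import math

def nodes_needed(total_cpu_requests, node_capacity_cpu, allocatable_pct):
    """Node count from CPU requests after reserving system headroom."""
    usable = node_capacity_cpu * allocatable_pct
    return math.ceil(total_cpu_requests / usable)

# 68 cores of requests on 4-core nodes, at three planning factors.
for pct in (0.85, 0.90, 0.95):
    print(pct, nodes_needed(68.0, 4.0, pct))  # 20, 19, 18 nodes
```

Two nodes of difference across the 85-95% range is exactly the kind of spread you want to see before committing to an instance type.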

3) CPU vs memory: why one of them usually dominates

A cluster can be CPU-bound or memory-bound. The safe workflow is to calculate node count from CPU requests and from memory requests, then take the larger number.

  • CPU-heavy services: watch throttling and p95 latency when requests are too low.
  • Memory-heavy services: watch OOM kills when limits are too tight (and watch wasted memory when requests are too high).
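A quick illustration of why one dimension dominates, using a made-up memory-hungry workload on 4-core / 16 GiB nodes at 90% allocatable:

```python
import math

pods = 50
total_cpu = pods * 0.25    # 12.5 cores of CPU requests
total_mem = pods * 2.0     # 100 GiB of memory requests

nodes_cpu = math.ceil(total_cpu / 3.6)    # 4 nodes if CPU were the constraint
nodes_mem = math.ceil(total_mem / 14.4)   # 7 nodes for memory
print(max(nodes_cpu, nodes_mem))          # memory decides the cluster size: 7
```

Here sizing from CPU alone would undercount by three nodes; always compute both.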

4) Limits matter for burst behavior

  • CPU limits can throttle bursts (good for fairness, bad if you rely on bursts for latency).
  • Memory limits can cause OOM kills if pods exceed limits (risk and instability).

Limits help you manage risk, but they are not a stable baseline for node count unless your workload frequently runs at the limit.

5) Two constraints people forget (and then undercount nodes)

  • Max pods per node: CNI/IP limits and kubelet settings cap pods/node even if CPU/memory looks fine.
  • Topology constraints: zone spread, affinities, taints, and disruption budgets reduce packing efficiency.
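The pod cap is just another floor on node count, so it slots into the same max() pattern. A sketch with made-up numbers (the cap of 17 is illustrative of a CNI/IP limit on a small instance type):

```python
import math

pods = 120
nodes_from_resources = 5   # whatever the CPU/memory request math produced
max_pods_per_node = 17     # illustrative CNI/IP cap on a small instance type

nodes_from_pod_cap = math.ceil(pods / max_pods_per_node)  # 120 / 17 -> 8
nodes = max(nodes_from_resources, nodes_from_pod_cap)     # 8: the pod cap wins
```

Many small pods on large nodes is the classic case where the pod cap, not CPU or memory, sets the cluster size.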

Worked sizing template (copy/paste)

  1. Pick representative per-pod requests (baseline, not peak).
  2. Compute total CPU and memory requests = pods x per-pod requests.
  3. Compute allocatable per node = node capacity x allocatable%.
  4. Compute nodes_cpu and nodes_mem, then take max(nodes_cpu, nodes_mem).
  5. Add a peak scenario (deployments, incident retries, seasonal traffic) and compare.
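The five steps above can be sketched end to end; every number below is an illustrative placeholder, not a recommendation:

```python
import math

# 1) Representative per-pod requests (baseline, not peak).
pods = 60
cpu_req, mem_req = 0.5, 1.0          # cores, GiB per pod

# 2) Total requests = pods x per-pod requests.
total_cpu = pods * cpu_req           # 30 cores
total_mem = pods * mem_req           # 60 GiB

# 3) Allocatable per node = node capacity x allocatable%.
alloc_cpu = 4.0 * 0.90               # 3.6 cores on a 4-core node
alloc_mem = 16.0 * 0.90              # 14.4 GiB on a 16 GiB node

# 4) Take the binding constraint.
nodes_cpu = math.ceil(total_cpu / alloc_cpu)   # 9
nodes_mem = math.ceil(total_mem / alloc_mem)   # 5
nodes = max(nodes_cpu, nodes_mem)              # 9 -> CPU-bound

# 5) Peak scenario (e.g. 1.5x pods during deploys or incident retries).
peak = max(math.ceil(1.5 * total_cpu / alloc_cpu),
           math.ceil(1.5 * total_mem / alloc_mem))   # 13 nodes at peak
```

Comparing baseline (9) to peak (13) tells you whether to buy the gap as reserved capacity or cover it with autoscaling.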

Common sizing pitfalls

  • Ignoring daemonset overhead: per-node agents eat capacity (logging, CNI, monitoring).
  • Forgetting max pods per node: IP limits and kubelet settings can cap pods/node.
  • Using peak traffic 24/7: budget with average usage, sanity-check with peak scenarios.

Next: turn node count into dollars

After you estimate node count, price it with Kubernetes Node Cost Calculator and then add other line items using the Kubernetes cost checklist: what to include.

Related guides

EKS node sizing: requests, overhead, and why packing is never perfect
A practical EKS node sizing guide: size from requests, reserve headroom, account for DaemonSets and max-pods limits, and understand why real scheduling often needs more nodes than the math minimum.
EKS vs GKE vs AKS cost: a practical comparison checklist (beyond node price)
Compare managed Kubernetes costs across EKS, GKE, and AKS by modeling the same line items: nodes, control plane, load balancers, storage, observability, and egress. Includes a worksheet template and validation steps for baseline vs peak.
Kubernetes cost calculator (cluster pricing checklist)
A practical checklist for estimating Kubernetes costs: node compute, control plane, load balancers, storage, egress, and observability. Includes a fast workflow, calculator links, and common pitfalls.
Kubernetes cost model beyond nodes: the checklist most teams miss
A practical Kubernetes cost model checklist: control plane, load balancers, storage, logs/metrics, and egress - plus links to calculators to estimate each part.
Kubernetes costs explained: nodes, egress, load balancers, and observability
A practical Kubernetes cost model: node baseline, cluster add-ons, load balancers, egress/data transfer, and logs/metrics. Includes the most common mistakes and the best calculators.

FAQ

Why do requests matter for cost?
Requests drive scheduling and therefore node count. Node count drives compute spend, which is usually the largest Kubernetes cost line item.
What allocatable percentage should I use?
A common planning range is 85-95% depending on system reservations, daemonsets, and required headroom. Start conservative and refine with real cluster metrics.
Should I use limits to size nodes?
Not as the baseline. Limits are about burst behavior and risk; sizing from limits typically overestimates the capacity you need, unless your workload frequently runs at the limit.

Last updated: 2026-01-27