EKS node sizing: requests, overhead, and why packing is never perfect

Node sizing in EKS (and Kubernetes in general) starts from requests: sum them, subtract per-node overhead, reserve headroom, and accept that scheduling is never a perfect bin-packing exercise. This page gives you a workflow that produces a defensible node count for budgeting and capacity planning.

Step 0: collect the inputs you actually need

  • Per-pod requests (CPU and memory) for each workload at steady state.
  • Replica counts (typical, not just max), plus deploy surge if you do rolling updates.
  • DaemonSets that run on every node (logging agent, metrics agent, CNI helpers).
  • Topology constraints (multi-AZ spread, anti-affinity) that reduce packing.


Step 1: compute total requested CPU and memory

  • Total CPU requested = Σ(pod CPU request × replicas)
  • Total memory requested = Σ(pod memory request × replicas)

Include the “deploy surge” case if you run extra replicas during rollouts (that surge is real capacity).
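The two sums above are plain multiply-and-add over your workload inventory. A minimal sketch, using a hypothetical workload table (the names and request values below are illustrative, not from a real cluster):

```python
# Hypothetical workloads: name -> (cpu_request_vcpu, mem_request_gib, replicas).
# Include deploy-surge replicas here if rollouts run extra pods.
workloads = {
    "api":    (0.5, 1.0, 12),
    "worker": (1.0, 4.0, 8),
    "cache":  (0.25, 2.0, 4),
}

total_cpu = sum(cpu * reps for cpu, _, reps in workloads.values())
total_mem = sum(mem * reps for _, mem, reps in workloads.values())

print(total_cpu, total_mem)  # 15.0 52.0  (vCPU, GiB)
```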

Step 2: convert node specs into allocatable capacity

Don’t size from marketing vCPU/RAM numbers. Size from allocatable (after system reservations). If you don’t have real nodes yet, use a conservative haircut (for example, reserve a slice for the OS and kubelet).

  • Allocatable CPU per node (vCPU) and allocatable memory per node (GiB)
  • System overhead: kube-system pods and node agents
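When you don't have real nodes to read allocatable values from, a conservative haircut can stand in. The reservation amounts below are illustrative assumptions, not EKS's actual reservation formula; prefer the real allocatable numbers from `kubectl describe node` once nodes exist:

```python
def estimate_allocatable(cpu_vcpu, mem_gib,
                         cpu_reserve_vcpu=0.5, mem_reserve_gib=4.0):
    # Placeholder haircuts for OS, kubelet, and system reservations.
    # Replace with real allocatable values when you have running nodes.
    return cpu_vcpu - cpu_reserve_vcpu, mem_gib - mem_reserve_gib

print(estimate_allocatable(8, 32))  # (7.5, 28.0)
```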

Step 3: account for per-node overhead (DaemonSets)

DaemonSets are a per-node tax. If a DaemonSet requests 200m CPU and 200Mi memory, that cost scales with node count, not pod count. Add the sum of DaemonSet requests to the per-node overhead in your model (equivalently, subtract it from each node's allocatable capacity).
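Applying that tax is a subtraction from per-node allocatable before you divide total requests by node capacity. The agent names and request values below are illustrative assumptions:

```python
# Hypothetical per-node DaemonSets: name -> (cpu_request_vcpu, mem_request_gib).
daemonsets = {
    "log-agent":     (0.2, 0.2),   # 200m CPU, 200Mi ~= 0.2 GiB
    "metrics-agent": (0.1, 0.25),
}

ds_cpu = sum(cpu for cpu, _ in daemonsets.values())
ds_mem = sum(mem for _, mem in daemonsets.values())

# Allocatable per node (assumed 7.5 vCPU / 28 GiB) minus the DaemonSet tax.
usable_cpu = 7.5 - ds_cpu
usable_mem = 28.0 - ds_mem
print(round(usable_cpu, 2), round(usable_mem, 2))  # 7.2 27.55
```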

Step 4: check max pods per node (density caps)

Many clusters hit a pod-density limit before CPU is full. Max pods per node depends on networking/IP/ENI constraints, which vary by instance type. If you ignore it, you'll underestimate the node count for fleets of many small pods.

  • Pod cap check: nodes needed ≥ ceil(total pods / max pods per node)
  • Use the larger of CPU-driven, memory-driven, and pod-cap-driven node counts.
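The two bullets above combine into a single max() over the three drivers. A sketch with illustrative numbers (a max-pods value of 29 is assumed for the example; look up the real cap for your instance type):

```python
from math import ceil

def nodes_needed(total_cpu, total_mem, total_pods,
                 node_cpu, node_mem, max_pods_per_node):
    """Take the largest of the CPU-, memory-, and pod-cap-driven counts."""
    by_cpu  = ceil(total_cpu / node_cpu)
    by_mem  = ceil(total_mem / node_mem)
    by_pods = ceil(total_pods / max_pods_per_node)
    return max(by_cpu, by_mem, by_pods)

# 240 small pods: CPU needs 4 nodes, memory needs 3, but the pod cap needs 9.
print(nodes_needed(24, 60, 240, 7.5, 28, 29))  # 9
```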

Step 5: reserve headroom for reality

The theoretical minimum assumes perfect packing and zero operational slack. Real clusters need headroom for rolling deploys, rescheduling during node drains, traffic spikes, and fragmentation.

  • Start with a utilization target like 70–85% of allocatable resources.
  • Increase headroom if workloads are bursty or autoscaling is slow/noisy.
  • Increase headroom if you enforce strict topology spread or anti-affinity.
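One simple way to apply the utilization target: divide the baseline count by the target so the fleet runs at roughly that utilization. This is a modeling convention, not the only valid one:

```python
from math import ceil

def with_headroom(baseline_nodes, utilization_target):
    # utilization_target in (0, 1]; 0.70-0.85 is the starting range above.
    return ceil(baseline_nodes / utilization_target)

print(with_headroom(7, 0.80))  # 9
```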

Worked example (structure, not provider-specific pricing)

  • Total steady requests: 48 vCPU and 160 GiB memory
  • Candidate node allocatable: 7.5 vCPU and 28 GiB (after reservations)
  • Baseline: max(ceil(48/7.5), ceil(160/28)) = max(7, 6) = 7 nodes
  • Apply 20% headroom → 9 nodes (round up)
  • Then verify pod density and topology constraints; adjust upward if required
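The worked example above reduces to a few lines of arithmetic:

```python
from math import ceil

total_cpu, total_mem = 48, 160    # steady-state requested vCPU / GiB
node_cpu, node_mem   = 7.5, 28    # allocatable per node, after reservations

baseline = max(ceil(total_cpu / node_cpu), ceil(total_mem / node_mem))
print(baseline)                   # 7 (max of 7 CPU-driven, 6 memory-driven)

sized = ceil(baseline / 0.80)     # 20% headroom via an 80% utilization target
print(sized)                      # 9
```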

Cost follow-up: once you have a node count, multiply it by per-node cost to estimate spend.

Common pitfalls

  • Sizing from limits instead of requests (inflates nodes and cost).
  • Ignoring DaemonSet requests that consume resources on every node.
  • Assuming perfect packing despite topology spread and anti-affinity constraints.
  • Forgetting pod-density caps (max pods per node) for many small workloads.
  • Targeting 95–100% utilization and then being surprised by deploy and failure events.

How to validate against a real cluster

  • Compare requested vs allocatable utilization over a representative week.
  • Measure DaemonSet overhead per node and confirm it matches your model.
  • Check scheduling failures and pending pods: they often reveal pod caps or fragmentation.
  • After changes, validate that node-hours dropped and that reliability didn’t regress.

FAQ

Should I size from requests or limits?
Size from requests. Requests drive scheduling and node count. Limits are safety caps; sizing from limits usually overestimates.
Why does the real node count exceed the spreadsheet?
Because packing is imperfect: DaemonSet overhead, max-pods limits, topology constraints, affinities, and fragmentation reduce utilization vs the theoretical max.
How much headroom should I reserve?
A common starting point is to target 70–85% utilization of allocatable resources, then adjust based on workload variability and autoscaling behavior.
What are the hidden costs around nodes?
Load balancers, NAT/egress, cross-AZ traffic, and logging/metrics can add meaningful costs beyond node compute.

Last updated: 2026-01-27