EKS node sizing: requests, overhead, and why packing is never perfect

Node sizing in EKS (and Kubernetes in general) starts from requests: sum them, subtract per-node overhead, reserve headroom, and accept that scheduling is never a perfect bin-packing exercise. This page gives you a workflow that produces a defensible node count for budgeting and capacity planning.

Step 0: collect the inputs you actually need

  • Per-pod requests (CPU and memory) for each workload at steady state.
  • Replica counts (typical, not just max), plus deploy surge if you do rolling updates.
  • DaemonSets that run on every node (logging agent, metrics agent, CNI helpers).
  • Topology constraints (multi-AZ spread, anti-affinity) that reduce packing.


Step 1: compute total requested CPU and memory

  • Total CPU requested = Σ(pod CPU request × replicas)
  • Total memory requested = Σ(pod memory request × replicas)

Include the “deploy surge” case if you run extra replicas during rollouts (that surge is real capacity).
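The two sums above are plain multiply-and-add over your workload inventory. A minimal sketch, using a hypothetical workload table (the names and request values below are illustrative, not from a real cluster):

```python
# Hypothetical workloads: name -> (cpu_request_vcpu, mem_request_gib, replicas).
# Include deploy-surge replicas here if rollouts run extra pods.
workloads = {
    "api":    (0.5, 1.0, 12),
    "worker": (1.0, 4.0, 8),
    "cache":  (0.25, 2.0, 4),
}

total_cpu = sum(cpu * reps for cpu, _, reps in workloads.values())
total_mem = sum(mem * reps for _, mem, reps in workloads.values())

print(total_cpu, total_mem)  # 15.0 52.0  (vCPU, GiB)
```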

Step 2: convert node specs into allocatable capacity

Don’t size from marketing vCPU/RAM numbers. Size from allocatable (after system reservations). If you don’t have real nodes yet, use a conservative haircut (for example, reserve a slice for the OS and kubelet).

  • Allocatable CPU per node (vCPU) and allocatable memory per node (GiB)
  • System overhead: kube-system pods and node agents
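When you don't have real nodes to read allocatable values from, a conservative haircut can stand in. The reservation amounts below are illustrative assumptions, not EKS's actual reservation formula; prefer the real allocatable numbers from `kubectl describe node` once nodes exist:

```python
def estimate_allocatable(cpu_vcpu, mem_gib,
                         cpu_reserve_vcpu=0.5, mem_reserve_gib=4.0):
    # Placeholder haircuts for OS, kubelet, and system reservations.
    # Replace with real allocatable values when you have running nodes.
    return cpu_vcpu - cpu_reserve_vcpu, mem_gib - mem_reserve_gib

print(estimate_allocatable(8, 32))  # (7.5, 28.0)
```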

Step 3: account for per-node overhead (DaemonSets)

DaemonSets are a per-node tax. If a DaemonSet requests 200m CPU and 200Mi memory, that cost scales with node count, not pod count. Add the sum of DaemonSet requests to the per-node overhead in your model (equivalently, subtract it from each node's allocatable capacity).
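Applying that tax is a subtraction from per-node allocatable before you divide total requests by node capacity. The agent names and request values below are illustrative assumptions:

```python
# Hypothetical per-node DaemonSets: name -> (cpu_request_vcpu, mem_request_gib).
daemonsets = {
    "log-agent":     (0.2, 0.2),   # 200m CPU, 200Mi ~= 0.2 GiB
    "metrics-agent": (0.1, 0.25),
}

ds_cpu = sum(cpu for cpu, _ in daemonsets.values())
ds_mem = sum(mem for _, mem in daemonsets.values())

# Allocatable per node (assumed 7.5 vCPU / 28 GiB) minus the DaemonSet tax.
usable_cpu = 7.5 - ds_cpu
usable_mem = 28.0 - ds_mem
print(round(usable_cpu, 2), round(usable_mem, 2))  # 7.2 27.55
```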

Step 4: check max pods per node (density caps)

Many clusters hit a pod-density limit before CPU is full. Max pods per node depends on networking/IP/ENI constraints, which vary by instance type. If you ignore it, you'll underestimate the node count for fleets of many small pods.

  • Pod cap check: nodes needed ≥ ceil(total pods / max pods per node)
  • Use the larger of CPU-driven, memory-driven, and pod-cap-driven node counts.
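The two bullets above combine into a single max() over the three drivers. A sketch with illustrative numbers (a max-pods value of 29 is assumed for the example; look up the real cap for your instance type):

```python
from math import ceil

def nodes_needed(total_cpu, total_mem, total_pods,
                 node_cpu, node_mem, max_pods_per_node):
    """Take the largest of the CPU-, memory-, and pod-cap-driven counts."""
    by_cpu  = ceil(total_cpu / node_cpu)
    by_mem  = ceil(total_mem / node_mem)
    by_pods = ceil(total_pods / max_pods_per_node)
    return max(by_cpu, by_mem, by_pods)

# 240 small pods: CPU needs 4 nodes, memory needs 3, but the pod cap needs 9.
print(nodes_needed(24, 60, 240, 7.5, 28, 29))  # 9
```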

Step 5: reserve headroom for reality

The theoretical minimum assumes perfect packing and zero operational slack. Real clusters need headroom for rolling deploys, rescheduling during node drains, traffic spikes, and fragmentation.

  • Start with a utilization target like 70–85% of allocatable resources.
  • Increase headroom if workloads are bursty or autoscaling is slow/noisy.
  • Increase headroom if you enforce strict topology spread or anti-affinity.
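One simple way to apply the utilization target: divide the baseline count by the target so the fleet runs at roughly that utilization. This is a modeling convention, not the only valid one:

```python
from math import ceil

def with_headroom(baseline_nodes, utilization_target):
    # utilization_target in (0, 1]; 0.70-0.85 is the starting range above.
    return ceil(baseline_nodes / utilization_target)

print(with_headroom(7, 0.80))  # 9
```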

Worked example (structure, not provider-specific pricing)

  • Total steady requests: 48 vCPU and 160 GiB memory
  • Candidate node allocatable: 7.5 vCPU and 28 GiB (after reservations)
  • Baseline: max(ceil(48/7.5), ceil(160/28)) = max(7, 6) = 7 nodes
  • Apply 20% headroom → 9 nodes (round up)
  • Then verify pod density and topology constraints; adjust upward if required
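The worked example above reduces to a few lines of arithmetic:

```python
from math import ceil

total_cpu, total_mem = 48, 160    # steady-state requested vCPU / GiB
node_cpu, node_mem   = 7.5, 28    # allocatable per node, after reservations

baseline = max(ceil(total_cpu / node_cpu), ceil(total_mem / node_mem))
print(baseline)                   # 7 (max of 7 CPU-driven, 6 memory-driven)

sized = ceil(baseline / 0.80)     # 20% headroom via an 80% utilization target
print(sized)                      # 9
```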

Cost follow-up: once you have a node count, multiply it by per-node cost to estimate spend.

Common pitfalls

  • Sizing from limits instead of requests (inflates nodes and cost).
  • Ignoring DaemonSet requests that consume resources on every node.
  • Assuming perfect packing despite topology spread and anti-affinity constraints.
  • Forgetting pod-density caps (max pods per node) for many small workloads.
  • Targeting 95–100% utilization and then being surprised by deploy and failure events.

How to validate against a real cluster

  • Compare requested vs allocatable utilization over a representative week.
  • Measure DaemonSet overhead per node and confirm it matches your model.
  • Check scheduling failures and pending pods: they often reveal pod caps or fragmentation.
  • After changes, validate that node-hours dropped and that reliability didn’t regress.

FAQ

Should I size from requests or limits?
Size from requests. Requests drive scheduling and node count. Limits are safety caps; sizing from limits usually overestimates.
Why does the real node count exceed the spreadsheet?
Because packing is imperfect: DaemonSet overhead, max-pods limits, topology constraints, affinities, and fragmentation reduce utilization vs the theoretical max.
How much headroom should I reserve?
A common starting point is to target 70–85% utilization of allocatable resources, then adjust based on workload variability and autoscaling behavior.
What are the hidden costs around nodes?
Load balancers, NAT/egress, cross-AZ traffic, and logging/metrics can add meaningful costs beyond node compute.

Last updated: 2026-01-27