ECS task sizing: how to pick CPU and memory (and estimate task count)
ECS task sizing is a balancing act: too small and you thrash (restarts, timeouts, throttling), too large and you pay for idle capacity. The best approach is to size from measured averages, keep headroom with a deliberate utilization target, and validate the result with a busy-week scenario.
Step 1: measure demand (use a representative window)
- CPU: average and p95 usage for the service (not only peak).
- Memory: average and p95 usage (include deploy/cold-cache windows).
- Traffic shape: steady, bursty, or time-of-day (helps estimate average tasks).
- Error/retry signals: timeouts and retries can multiply load and invalidate "normal week" sizing.
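Once you have exported utilization samples (for example, per-minute CPU readings from CloudWatch), the average and p95 in the checklist above reduce to simple arithmetic. A minimal sketch, assuming the samples are already in a plain list; the sample values are illustrative:

```python
# Summarize exported utilization samples into average and p95.
# Assumes per-minute samples already pulled into a list (e.g. from CloudWatch).
import math

def summarize(samples):
    """Return (average, p95) for a list of utilization samples."""
    ordered = sorted(samples)
    avg = sum(ordered) / len(ordered)
    # Nearest-rank p95: value at the 95th-percentile position.
    rank = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return avg, ordered[rank]

# Illustrative CPU utilization samples (fraction of 1 vCPU):
cpu_samples = [0.42, 0.38, 0.55, 0.61, 0.47, 0.83, 0.44, 0.50, 0.39, 0.91]
avg, p95 = summarize(cpu_samples)
print(f"avg={avg:.2f} p95={p95:.2f}")  # -> avg=0.55 p95=0.91
```

Note the gap between average (0.55) and p95 (0.91) here: sizing from the average alone would leave no room for the bursts.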
Step 2: pick per-task vCPU and memory
- Choose the smallest task size that meets stability and latency targets at steady load.
- Prefer scaling out (more tasks) over a single huge task when it improves utilization and reduces tail latency.
- Keep memory headroom for spikes, GC, caching, and temporary buffers. Memory is often the real limiter.
A common failure mode is to pick "round numbers" (1 vCPU / 2 GB) and never revisit them. Treat task size as a model input that evolves with the service.
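The "smallest task size that meets targets" guidance can be mechanized. A sketch, assuming per-task p95 usage is known and using common Fargate vCPU/memory combinations (verify the valid pairs against current Fargate documentation; the 1.3x headroom factor is an illustrative default):

```python
# Pick the smallest listed Fargate size that covers per-task p95 usage
# plus headroom. SIZES lists common (vCPU, GB) combinations; verify
# against current Fargate documentation before relying on them.
SIZES = [(0.25, 0.5), (0.25, 1), (0.25, 2),
         (0.5, 1), (0.5, 2), (0.5, 4),
         (1, 2), (1, 4), (1, 8),
         (2, 4), (2, 8), (2, 16),
         (4, 8), (4, 16), (4, 30)]

def smallest_fit(p95_vcpu, p95_mem_gb, headroom=1.3):
    """Return the first size where p95 usage * headroom fits both dimensions."""
    for vcpu, mem in SIZES:
        if p95_vcpu * headroom <= vcpu and p95_mem_gb * headroom <= mem:
            return vcpu, mem
    raise ValueError("demand exceeds largest listed size; scale out instead")

# Observed per-task p95: 0.6 vCPU, 2.5 GB -> needs 0.78 vCPU / 3.25 GB
print(smallest_fit(0.6, 2.5))  # -> (1, 4)
```

Because the list is ordered smallest-first, the first fit is the cheapest listed size that satisfies both dimensions; often memory, not CPU, forces the step up.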
Step 3: pick a utilization target
The utilization target is the planning headroom you keep so scaling and deploys complete without timeouts.
- Lower target (more headroom): more stable, higher cost.
- Higher target (less headroom): cheaper, but more sensitive to bursts and slow dependencies.
- For spiky services, separate baseline and burst capacity instead of one target.
Step 4: estimate average task count (the billing driver)
The bill tracks average running tasks over time, not the peak moment. A practical sizing model is:
tasks ~= max(
  cpu_demand / (task_vcpu * target_utilization),
  mem_demand / (task_mem_gb * target_utilization)
)
Worked example (order-of-magnitude)
- Average CPU demand: 6 vCPU
- Average memory demand: 18 GB
- Task size: 1 vCPU / 3 GB
- Target utilization: 0.7
- CPU-based tasks: 6 / (1 * 0.7) ~= 8.6, round up to 9
- Memory-based tasks: 18 / (3 * 0.7) ~= 8.6, round up to 9
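The sizing model above translates directly to code: tasks are driven by whichever dimension, CPU or memory, requires more replicas, rounded up. The numbers match the worked example:

```python
# The max(cpu, mem) sizing model, rounded up to whole tasks.
import math

def estimate_tasks(cpu_demand_vcpu, mem_demand_gb,
                   task_vcpu, task_mem_gb, target_utilization):
    cpu_tasks = cpu_demand_vcpu / (task_vcpu * target_utilization)
    mem_tasks = mem_demand_gb / (task_mem_gb * target_utilization)
    return math.ceil(max(cpu_tasks, mem_tasks))

# Worked example: 6 vCPU / 18 GB demand, 1 vCPU / 3 GB tasks, 0.7 target.
print(estimate_tasks(6, 18, 1, 3, 0.7))  # -> 9
```

Running it with a 2 vCPU / 6 GB task size halves the count to 5, which is the trade-off the next paragraph discusses.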
If you size tasks larger (2 vCPU / 6 GB) but your service is not actually CPU-bound, you can reduce task count, but you also reduce packing flexibility and increase idle capacity within each task. Validate with real utilization.
Convert task sizing to cost (Fargate vs EC2)
- ECS on Fargate: vCPU-hours + memory GB-hours for running tasks.
- ECS on EC2: instance-hours (plus EBS and snapshots if you attach volumes).
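For Fargate, converting average task count to a monthly estimate is one multiplication per billing dimension. A sketch; the per-unit prices below are illustrative placeholders, not current AWS pricing, so substitute your region's published rates:

```python
# Convert average Fargate task count to a monthly compute estimate.
# Prices are assumed placeholders, NOT current AWS rates.
VCPU_HOUR = 0.04      # assumed $/vCPU-hour (placeholder)
GB_HOUR = 0.0045      # assumed $/GB-hour (placeholder)
HOURS_PER_MONTH = 730

def fargate_monthly(avg_tasks, task_vcpu, task_mem_gb):
    per_task_hour = task_vcpu * VCPU_HOUR + task_mem_gb * GB_HOUR
    return avg_tasks * per_task_hour * HOURS_PER_MONTH

# 9 average tasks of 1 vCPU / 3 GB:
print(round(fargate_monthly(9, 1, 3), 2))
```

Note that the input is average running tasks, not peak: a service that bursts to 20 tasks for an hour a day bills very differently from one that holds 20 all day.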
Cost pitfalls that look like "bad sizing"
- Noisy scaling: CPU% spikes trigger oscillation and keep average tasks high.
- Retry storms: timeouts multiply requests, tasks, logs, and transfer.
- Logs: ingestion + retention can exceed compute for high-traffic or verbose services.
- NAT/egress: image pulls and external calls can create large variable network costs.
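The logs pitfall is easy to sanity-check numerically. A sketch, with an assumed placeholder ingestion price (not current CloudWatch Logs pricing):

```python
# Sanity-check whether log ingestion rivals compute for a verbose service.
LOG_INGEST_PER_GB = 0.50   # assumed $/GB ingested (placeholder)

def monthly_log_cost(gb_per_day):
    return gb_per_day * 30 * LOG_INGEST_PER_GB

# A service logging 40 GB/day:
print(monthly_log_cost(40))  # -> 600.0
```

At these assumed rates, 40 GB/day of logs costs more per month than the nine-task compute estimate above, which is why verbose services often look "badly sized" when the real problem is logging.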
Full model: ECS cost model beyond compute.
Validation checklist
- Validate average and p95 CPU/memory over at least 7 days (include a busy day).
- Validate task count over time (average vs peak) and compare to the sizing model.
- Validate latency and error rate during deploys and scaling events.
- Validate non-compute: log ingestion GB/day and NAT processed GB during busy windows.
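The "compare to the sizing model" item can run as a periodic check. A minimal sketch; the 20% drift tolerance is an illustrative threshold, not a standard:

```python
# Flag when observed average task count drifts from the sizing model.
def check_model(observed_avg_tasks, predicted_tasks, tolerance=0.2):
    """Return True if observation is within +/- tolerance of the model."""
    if predicted_tasks <= 0:
        raise ValueError("predicted_tasks must be positive")
    drift = abs(observed_avg_tasks - predicted_tasks) / predicted_tasks
    return drift <= tolerance

print(check_model(11, 9))   # ~22% over model -> False, revisit sizing
print(check_model(9.5, 9))  # within 20% -> True
```

A failed check does not say which input is wrong; it says the model and reality disagree, which is the cue to re-measure demand and revisit the utilization target.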
Sources
- ECS task definitions: docs.aws.amazon.com
- ECS service autoscaling: docs.aws.amazon.com