gp3 IOPS and throughput: how to size (EBS)

With gp3, you choose performance explicitly: IOPS and throughput are provisioned independently of volume size (every volume includes a 3,000 IOPS / 125 MB/s baseline, provisionable up to 16,000 IOPS and 1,000 MB/s). That is great for cost control, but it also makes it easy to over-provision IOPS and throughput you never use. The safest sizing workflow starts from measured utilization and validates performance under realistic load.

gp3 sizing cues

  • IOPS: baseline random IOPS from workload metrics.
  • Throughput: MB/s for sequential scans or backups.
  • Volume size: GB baseline plus growth buffer.

Step 1: measure your baseline

  • IOPS utilization (average and p95) during a representative week
  • Throughput utilization (MB/s) during the same window
  • Latency and tail latency during busy periods

If you only measure averages, you will under-size for busy periods; if you size for the single worst spike, you will overpay.
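To make the baseline concrete, here is a minimal Python sketch that summarizes per-minute IOPS samples (e.g. exported from your monitoring system; the sample values here are hypothetical) into an average and a nearest-rank p95:

```python
# Summarize per-minute IOPS samples into average and p95.
# Sample values are hypothetical; in practice, export them from your
# monitoring system over a representative week.

def p95(samples):
    """Nearest-rank 95th percentile of a list of numbers."""
    ordered = sorted(samples)
    rank = max(0, int(round(0.95 * len(ordered))) - 1)
    return ordered[rank]

iops_samples = [1200, 1500, 1100, 2800, 3000, 1400, 2950, 1600]
avg_iops = sum(iops_samples) / len(iops_samples)
print(f"avg={avg_iops:.0f} IOPS, p95={p95(iops_samples)} IOPS")
```

The same summary applies to throughput (MB/s) and latency samples; the point is to capture both the typical level and the busy tail before picking targets.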

Step 2: choose targets (planning rules)

  • IOPS target: start near p95 and add a small buffer for growth and burst.
  • Throughput target: start near p95 MB/s; throughput matters most for large sequential reads/writes.
  • Capacity: size GB for data + growth, not to “buy performance”.
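The planning rules above can be sketched as a small helper. It assumes the gp3 provisioning limits of 3,000–16,000 IOPS and 125–1,000 MB/s, and treats the 20% growth buffer as a planning choice, not a rule:

```python
# Turn measured p95s into gp3 provisioning targets, clamped to the
# gp3 limits (3,000-16,000 IOPS, 125-1,000 MB/s). The buffer is a
# planning assumption, not a service requirement.

GP3_MIN_IOPS, GP3_MAX_IOPS = 3_000, 16_000
GP3_MIN_TPUT, GP3_MAX_TPUT = 125, 1_000  # MB/s

def gp3_targets(p95_iops, p95_mbps, buffer=0.20):
    iops = min(max(int(p95_iops * (1 + buffer)), GP3_MIN_IOPS), GP3_MAX_IOPS)
    mbps = min(max(int(p95_mbps * (1 + buffer)), GP3_MIN_TPUT), GP3_MAX_TPUT)
    return iops, mbps

print(gp3_targets(3_000, 48))  # -> (3600, 125): the included 125 MB/s
                               #    already covers this throughput need
```

Note how the clamping matters: a modest throughput need often falls below the included 125 MB/s baseline, in which case provisioning extra throughput buys nothing.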

Workload patterns (why IOPS vs throughput matters)

  • Small random IO: IOPS bound (databases, metadata-heavy workloads).
  • Large sequential IO: throughput bound (backups, large file operations).
  • Mixed workloads: size both and validate under real traffic.

Rule of thumb: throughput is IOPS times IO size

Rough relationship: throughput (MB/s) ~= IOPS * IO size (MB). If your IO size is small, you can be IOPS-bound even if throughput looks low.

Worked example (planning)

  • Workload p95: 3,000 IOPS with ~16 KB IO size (~0.016 MB)
  • Throughput ~= 3,000 * 0.016 ~= 48 MB/s
  • If you move to 128 KB IO size (~0.125 MB), throughput at 3,000 IOPS would be ~375 MB/s

This is why workload behavior (random vs sequential, IO size) matters more than a single "IOPS" number.
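The worked example is just this arithmetic, which a few lines of Python make explicit (using 1 MB = 1,024 KB here, so 16 KB comes out as ~46.9 MB/s rather than the rounded ~48 above):

```python
# Rule of thumb: throughput (MB/s) ~= IOPS * IO size (MB).

def throughput_mbps(iops, io_size_kb):
    return iops * io_size_kb / 1024  # convert KB to MB

print(throughput_mbps(3_000, 16))   # -> 46.875 (IOPS-bound territory)
print(throughput_mbps(3_000, 128))  # -> 375.0 (throughput now dominates)
```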

Common pitfalls

  • Setting IOPS and throughput from a single peak event (over-provisioning).
  • Ignoring snapshots and restore patterns that create temporary throughput bursts.
  • Changing settings without a canary and without tracking latency regressions.
  • Overlooking application bottlenecks (CPU, network) and blaming the volume.

Validation checklist

  • Validate performance under a busy window (deploy or batch job) after changes.
  • Track latency distribution (p50/p95/p99) for the workload, not just IOPS.
  • Re-check after 1–2 weeks to ensure growth does not remove safety buffer.
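For the latency check in the list above, a quick way to summarize raw samples (hypothetical values in ms below) into p50/p95/p99 is a nearest-rank percentile; a real monitoring system usually computes these for you, but this helps for ad-hoc validation from raw measurements:

```python
# Summarize a latency sample (ms) into p50/p95/p99 via nearest-rank
# percentiles. Sample values are hypothetical.

def percentile(samples, pct):
    ordered = sorted(samples)
    rank = max(0, int(round(pct / 100 * len(ordered))) - 1)
    return ordered[rank]

latencies_ms = [1.1, 1.3, 0.9, 1.2, 5.4, 1.0, 1.4, 9.8, 1.2, 1.1]
for pct in (50, 95, 99):
    print(f"p{pct} = {percentile(latencies_ms, pct)} ms")
# -> p50 = 1.2 ms, p95 = 9.8 ms, p99 = 9.8 ms
```

With only ten samples the tail percentiles collapse onto the worst observation, which is exactly why the checklist asks for a busy window, not a handful of quiet-hour probes.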

FAQ

What is the common gp3 cost mistake?
Provisioning IOPS and throughput far above what the workload uses. Always start from measured utilization and size for p95, not worst-case peaks.

Should I size for peak or average?
For performance, size for p95/p99 and validate during busy windows. For cost, avoid using a single worst-case spike as a baseline unless it happens regularly.

Last updated: 2026-02-07