S3 data transfer costs: egress, cross-region access, and common surprises
Start with a calculator if you need a first-pass estimate, then use this guide to validate the assumptions and catch the billing traps.
In many storage-heavy systems, S3 data transfer (egress) grows into a larger line item than storage itself. This guide helps you model transfer without double-counting traffic or missing billed paths.
Step 1: Map the transfer boundary (draw the path)
Transfer estimates fail when you do not define what is "inside" vs "outside" the billed boundary. Start by drawing a simple path for a real request: client -> CDN (optional) -> origin (S3) -> compute -> other regions -> external systems.
- Internet egress: S3 to the public internet (downloads to users, customers, partners)
- CDN origin egress (cache fill): S3 to CDN on cache misses, revalidation, and edge fills
- Cross-region reads: app in Region A reading a bucket in Region B
- Replication/copy traffic: CRR/SRR, batch copies, migrations, backfills
- Private networking paths: VPC endpoints, inter-AZ traffic, and "inside the cloud" flows that still show up as transfer usage types
Step 2: Estimate GB/month (3 practical methods)
Choose the method that matches what you can measure today. You can refine later.
- Throughput method: if you have Mbps charts (APM, CDN analytics, load balancer metrics), convert average Mbps to GB/month with the Units Converter.
- Request method: if you know requests/day and average bytes served, compute GB/month. Tool: API response transfer.
- Billing/log method: prefer measured bytes from CUR/Cost Explorer, S3 access logs, or CDN metrics. This avoids unit mistakes.
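The throughput and request methods reduce to simple arithmetic. The sketch below is a minimal illustration, assuming a 30-day month and decimal GB (10^9 bytes); the function names and sample inputs are hypothetical, and you should check which GB definition your bill and tools use.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month
BYTES_PER_GB = 1e9                  # assumption: decimal GB


def mbps_to_gb_per_month(avg_mbps: float, seconds: int = SECONDS_PER_MONTH) -> float:
    """Throughput method: average megabits/s -> GB/month."""
    bytes_per_second = avg_mbps * 1_000_000 / 8  # bits to bytes
    return bytes_per_second * seconds / BYTES_PER_GB


def requests_to_gb_per_month(requests_per_day: float, avg_bytes: float, days: int = 30) -> float:
    """Request method: requests/day * average bytes served -> GB/month."""
    return requests_per_day * days * avg_bytes / BYTES_PER_GB


# Hypothetical inputs, for illustration only.
throughput_estimate = mbps_to_gb_per_month(100)                # sustained 100 Mbps average
request_estimate = requests_to_gb_per_month(2_000_000, 250_000)  # 2M requests/day at 250 KB
```

Use the average Mbps over the month, not the peak; a peak value will overstate the monthly volume.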
Step 3: Price the right line item (egress vs cross-region vs CDN)
- Internet egress: data egress cost
- Cross-region: cross-region transfer
- CDN bandwidth: CDN bandwidth
A common pattern is paying for both (1) CDN bandwidth (edge-to-user) and (2) origin egress (S3-to-CDN) on cache fill. These are different bytes, so model and price them separately.
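To see why the two quantities differ, here is a rough sketch that derives origin egress from CDN bytes served and a byte hit ratio. The function name and the hit ratio input are assumptions for illustration; revalidation traffic is ignored, and whether the origin leg is billed at all depends on the CDN and provider you use.

```python
def split_cdn_traffic(bytes_served_gb: float, byte_hit_ratio: float) -> dict:
    """Split traffic into edge-to-user GB and S3-to-CDN cache-fill GB.

    byte_hit_ratio is the fraction of bytes served from cache (0.0 to 1.0).
    """
    if not 0.0 <= byte_hit_ratio <= 1.0:
        raise ValueError("byte_hit_ratio must be between 0 and 1")
    origin_egress_gb = bytes_served_gb * (1.0 - byte_hit_ratio)
    return {
        "cdn_bandwidth_gb": bytes_served_gb,   # priced as CDN bandwidth
        "origin_egress_gb": origin_egress_gb,  # priced as origin egress
    }


# Hypothetical example: 10,000 GB served at a 90% byte hit ratio.
example = split_cdn_traffic(10_000, 0.90)
```

With these inputs, the CDN line carries the full 10,000 GB while the origin line carries roughly 1,000 GB. Pricing both at the same rate or the same volume would misstate the bill.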
Worked estimate template (copy/paste)
- User downloads = downloads/month * average object size (GB)
- CDN cache fill = cache misses/month * average object size (GB) + revalidation traffic (if meaningful)
- Cross-region reads = reads/month * average read size (GB) for each region boundary
- Replication = changed data replicated (GB/month) + one-time backfill (if any)
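The same template can be kept as a small script so the inputs stay explicit. This is a sketch under stated assumptions: the field names are hypothetical, all sizes are in GB, and it models a single region boundary (repeat the cross-region line for each boundary you have).

```python
from dataclasses import dataclass


@dataclass
class TransferInputs:
    downloads_per_month: float
    avg_object_gb: float
    cache_misses_per_month: float
    revalidation_gb: float            # set to 0 if not meaningful
    cross_region_reads_per_month: float
    avg_read_gb: float
    replicated_gb_per_month: float    # changed data, driven by writes/churn
    backfill_gb: float = 0.0          # one-time, if any


def estimate_gb(i: TransferInputs) -> dict:
    """Return GB/month per line item; lines are kept separate because they price differently."""
    return {
        "user_downloads_gb": i.downloads_per_month * i.avg_object_gb,
        "cdn_cache_fill_gb": i.cache_misses_per_month * i.avg_object_gb + i.revalidation_gb,
        "cross_region_reads_gb": i.cross_region_reads_per_month * i.avg_read_gb,
        "replication_gb": i.replicated_gb_per_month + i.backfill_gb,
    }


# Hypothetical inputs, for illustration only.
inputs = TransferInputs(
    downloads_per_month=1_000_000, avg_object_gb=0.005,
    cache_misses_per_month=100_000, revalidation_gb=20,
    cross_region_reads_per_month=200_000, avg_read_gb=0.001,
    replicated_gb_per_month=300, backfill_gb=1_000,
)
estimate = estimate_gb(inputs)
```

The function deliberately does not sum the lines into one total, since each one maps to a different rate.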
Common pitfalls (transfer is easy to mis-price)
- Conflating CDN bandwidth (edge-to-user) with S3 origin egress (S3-to-CDN cache fill). Guide: origin egress vs CDN bandwidth.
- Mixing replication traffic with user downloads. Replication is driven by writes/churn, not reads.
- Forgetting the "inside cloud" boundary (cross-region reads, private networking, NAT/egress paths).
- Using average object size without checking compression, partial responses, or range requests.
- Unit errors: GB vs GiB and Mbps vs MB/s (use the Units Converter if you're unsure).
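The size of the unit errors in the last item is easy to quantify. A minimal sketch, with hypothetical helper names:

```python
BYTES_PER_GB = 10**9    # decimal gigabyte
BYTES_PER_GIB = 2**30   # binary gibibyte


def gib_to_gb(gib: float) -> float:
    """Convert GiB to decimal GB."""
    return gib * BYTES_PER_GIB / BYTES_PER_GB


def mbps_to_megabytes_per_second(mbps: float) -> float:
    """Convert megabits/s (Mbps) to megabytes/s (MB/s)."""
    return mbps / 8


gb_equivalent = gib_to_gb(1000)                    # 1000 GiB expressed in GB
mb_per_second = mbps_to_megabytes_per_second(100)  # 100 Mbps expressed in MB/s
```

Reading GiB as GB shifts the estimate by about 7%; reading Mbps as MB/s shifts it by a factor of 8.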
How to validate the estimate
- In CUR/Cost Explorer, group by usage type and confirm your model maps to real transfer usage lines.
- Compare your estimated GB/month to CDN analytics bytes served and to S3 access logs (if enabled).
- Spot-check the top 1-3 endpoints/prefixes by traffic to confirm average response size assumptions.
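As one way to do the usage-type grouping, the sketch below totals transfer-like usage from a CUR export in CSV form. It assumes the legacy CUR column names `lineItem/ProductCode`, `lineItem/UsageType`, and `lineItem/UsageAmount`; the sample rows are made up, and treating any usage type ending in `-Bytes` as transfer is a heuristic assumption you should check against your own report.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration; a real CUR has many more columns.
SAMPLE_CUR = """lineItem/ProductCode,lineItem/UsageType,lineItem/UsageAmount
AmazonS3,DataTransfer-Out-Bytes,1200.5
AmazonS3,USE1-USW2-AWS-Out-Bytes,300.0
AmazonS3,TimedStorage-ByteHrs,50000.0
AmazonS3,DataTransfer-Out-Bytes,799.5
"""


def transfer_usage_by_type(cur_csv: str, product: str = "AmazonS3") -> dict:
    """Sum usage amounts per usage type for transfer-like S3 lines."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(cur_csv)):
        if row["lineItem/ProductCode"] != product:
            continue
        usage_type = row["lineItem/UsageType"]
        if usage_type.endswith("-Bytes"):  # heuristic filter (assumption)
            totals[usage_type] += float(row["lineItem/UsageAmount"])
    return dict(totals)


measured = transfer_usage_by_type(SAMPLE_CUR)
```

Each key in the result should correspond to a line in your model (internet egress, cross-region, and so on). A usage type with no matching line is a missed cost, and a model line with no matching usage type is a candidate for double-counting.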