NAT gateway costs: why they spike and how to estimate them
Start with a calculator if you need a first-pass estimate, then use this guide to validate the assumptions and catch the billing traps.
Use this page to decide what belongs inside the NAT Gateway bill before you debate endpoints, retries, or traffic-shape fixes. NAT gateway bills are often surprising because they combine a fixed hourly baseline with traffic-based processing charges. This page settles bill scope first; workflow and optimization come afterward.
What belongs inside the NAT Gateway bill
This guide is about bill boundaries. Gateway-hours and processed-GB charges belong to NAT Gateway. Cross-AZ transfer, internet egress, private connectivity, and downstream traffic costs are adjacent: track them beside NAT Gateway rather than blending them into it.
- Inside the bill: NAT Gateway hourly charges and processed traffic charges.
- Beside the bill: egress, transfer spillover, endpoint alternatives, and downstream service costs.
- Decision point: separate the bill first, then decide whether you need measurement or intervention.
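The inside/beside split above can be sketched as a small classifier over billing line items. This is a minimal illustration, not a billing tool: the usage-type substrings (`NatGateway-Hours`, `NatGateway-Bytes`), the region prefix, and the dollar amounts are assumptions for the example, so check them against the usage types in your own cost report before relying on them.

```python
# Substrings assumed to identify NAT Gateway line items (verify in your own cost report).
INSIDE_MARKERS = ("NatGateway-Hours", "NatGateway-Bytes")


def split_bill(line_items):
    """Split (usage_type, cost_usd) pairs into 'inside' and 'beside' the NAT Gateway bill."""
    inside, beside = [], []
    for usage_type, cost_usd in line_items:
        if any(marker in usage_type for marker in INSIDE_MARKERS):
            inside.append((usage_type, cost_usd))
        else:
            beside.append((usage_type, cost_usd))
    return inside, beside


# Hypothetical line items; names and amounts are illustrative only.
sample_items = [
    ("USW2-NatGateway-Hours", 98.55),
    ("USW2-NatGateway-Bytes", 225.00),
    ("USW2-DataTransfer-Regional-Bytes", 40.00),
    ("USW2-DataTransfer-Out-Bytes", 310.00),
]

inside, beside = split_bill(sample_items)
inside_total = sum(cost for _, cost in inside)
beside_total = sum(cost for _, cost in beside)
print(f"inside NAT bill: ${inside_total:.2f}, tracked beside it: ${beside_total:.2f}")
```

Keeping the two totals separate makes it harder for an egress or cross-AZ increase to be mistaken for a NAT Gateway increase.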
NAT cost inputs
- Hours: gateways × hours running; one NAT gateway per AZ is typical.
- Processed GB: API calls and downloads drive this.
- Endpoints: VPC endpoints can reduce NAT GB by keeping AWS-service traffic off the NAT path.
The cost model (what to budget)
- Gateway-hours: gateways × hours/month (730 for always-on)
- GB processed: total GB traversing NAT per month
- Total: gateway-hours × $/hour + GB processed × $/GB (plus any related transfer/egress lines)
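The cost model above translates directly into a few lines of arithmetic. The sketch below assumes placeholder rates of $0.045 per gateway-hour and $0.045 per GB. Treat those numbers, the function name, and the example inputs as illustrative assumptions and substitute the rates for your region from the current price list.

```python
HOURS_PER_MONTH = 730  # always-on month, as used in this guide


def nat_monthly_cost(gateways, gb_processed, usd_per_hour, usd_per_gb,
                     hours=HOURS_PER_MONTH):
    """Return the NAT Gateway cost components for one month."""
    gateway_hours = gateways * hours
    hourly_usd = gateway_hours * usd_per_hour
    processing_usd = gb_processed * usd_per_gb
    return {
        "gateway_hours": gateway_hours,
        "hourly_usd": hourly_usd,
        "processing_usd": processing_usd,
        # Related transfer/egress lines are deliberately NOT included here.
        "total_usd": hourly_usd + processing_usd,
    }


# Example: 3 gateways (one per AZ), 5,000 GB processed, assumed placeholder rates.
estimate = nat_monthly_cost(gateways=3, gb_processed=5000,
                            usd_per_hour=0.045, usd_per_gb=0.045)
for name, value in estimate.items():
    print(f"{name}: {value:,.2f}")
```

Returning the components separately, rather than a single total, keeps the fixed baseline and the traffic-driven part visible when the bill changes.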
Tool: NAT Gateway cost calculator
Why NAT costs spike (the common root causes)
- Container image pulls: large images pulled by many nodes/tasks, especially during scaling events.
- OS/package updates: fleets doing repeated downloads through NAT.
- External APIs/SaaS: high-throughput outbound calls from private workloads.
- Log shipping: exporting logs to external destinations through NAT.
- Retry storms: a small outage can multiply outbound traffic and processed GB.
How to estimate GB processed (3 practical methods)
- From NAT gateway metrics: sum bytes over a representative window and scale to monthly.
- From VPC Flow Logs: filter to NAT gateway ENIs and sum bytes.
- From throughput charts: convert average Mbps to GB/month (good for a first pass).
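The three methods above can each be sketched as a small helper. These are rough first-pass tools under stated assumptions: GB means decimal gigabytes (10^9 bytes), a month is 730 hours, and Flow Logs records use the default version 2 space-separated format with the interface ID in the third field and bytes in the tenth. The function names and sample ENI ID are hypothetical. A NAT gateway's network interface can log the same payload on both its private and public side, so a raw Flow Logs sum may overcount; filter by direction before treating it as billable GB.

```python
def scale_window_to_month(bytes_in_window, window_hours, hours_per_month=730):
    """Scale bytes summed over a representative window to GB per month."""
    return bytes_in_window / 1e9 * (hours_per_month / window_hours)


def mbps_to_gb_per_month(avg_mbps, hours_per_month=730):
    """Convert an average throughput in megabits per second to GB per month."""
    bytes_per_second = avg_mbps * 1_000_000 / 8
    return bytes_per_second * hours_per_month * 3600 / 1e9


def sum_flow_log_bytes(lines, nat_eni_ids):
    """Sum the bytes field of default-format flow log records for the given ENIs."""
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 14 or fields[0] == "version":
            continue  # skip header rows and malformed lines
        interface_id, byte_count = fields[2], fields[9]
        # NODATA/SKIPDATA records carry "-" instead of a number.
        if interface_id in nat_eni_ids and byte_count.isdigit():
            total += int(byte_count)
    return total


# Example: a sustained 10 Mbps average, and 70 GB observed over a 7-day window.
print(f"{mbps_to_gb_per_month(10):,.0f} GB/month from throughput")
print(f"{scale_window_to_month(70e9, window_hours=168):,.0f} GB/month from a 7-day window")
```

The throughput method hides bursts, so compare it against a metrics-based or Flow Logs figure before budgeting on it.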
Step-by-step: estimate NAT GB processed
When this is not the right page
- If you need to build a defendable processed-GB model, go to the estimate guide.
- If you already know the dominant driver and need to change production behavior, go to the optimization guide.
- If you are comparing NAT against private alternatives, treat that as a separate decision after the bill boundary is clear.
Architecture gotchas (where costs hide)
- Non-prod always-on: dev/test running 730h/month creates baseline gateway-hours.
- Accidental NAT path to AWS services: traffic to AWS APIs can still go through NAT when endpoints/private access aren’t used.
- Cross-AZ routing: centralized egress patterns can introduce cross-AZ transfer in addition to NAT.
- Multi-AZ HA choices: more NAT gateways can improve locality and availability but increase gateway-hours, so model the trade-off explicitly.
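One way to model the multi-AZ trade-off explicitly is to price both layouts side by side. The sketch below is a simplification under assumptions: all rates are placeholders, `cross_az_fraction` is a hypothetical estimate of how much traffic originates outside the gateway's AZ, and `usd_per_gb_cross_az` is assumed to already cover both directions of the transfer charge. It ignores availability, which is often the real reason to run one gateway per AZ.

```python
def per_az_nat_cost(azs, gb_processed, usd_per_hour, usd_per_gb, hours=730):
    """One NAT gateway per AZ: more gateway-hours, no cross-AZ hop to reach NAT."""
    return azs * hours * usd_per_hour + gb_processed * usd_per_gb


def centralized_nat_cost(gb_processed, cross_az_fraction, usd_per_hour,
                         usd_per_gb, usd_per_gb_cross_az, hours=730):
    """One shared NAT gateway: fewer gateway-hours, plus cross-AZ transfer."""
    nat_usd = hours * usd_per_hour + gb_processed * usd_per_gb
    cross_az_usd = gb_processed * cross_az_fraction * usd_per_gb_cross_az
    return nat_usd + cross_az_usd


# Assumed inputs: 3 AZs, 5,000 GB/month, two thirds of traffic from other AZs.
rates = {"usd_per_hour": 0.045, "usd_per_gb": 0.045}  # placeholder rates
per_az = per_az_nat_cost(azs=3, gb_processed=5000, **rates)
central = centralized_nat_cost(gb_processed=5000, cross_az_fraction=2 / 3,
                               usd_per_gb_cross_az=0.02, **rates)
print(f"per-AZ: ${per_az:,.2f}  centralized: ${central:,.2f}")
```

Because the extra gateway-hours are fixed and the cross-AZ charge grows with traffic, rerun the comparison with your own monthly GB rather than assuming one layout is always cheaper.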
If you’re comparing NAT against endpoints or private connectivity, see: NAT vs VPC endpoints cost.
Validation checklist (do this after changes)
- Confirm gateway-hours dropped as expected and were not held flat by unused gateways left running.
- Confirm GB processed dropped and identify which traffic source changed.
- Check for shifted costs: data transfer/cross-AZ and internet egress can move when routing changes.
- Re-check incident windows: if retries still spike, the problem will return.
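The checklist above can be run as a before/after comparison of two monthly snapshots. This is a minimal sketch: the snapshot keys (`gateway_hours`, `gb_processed`, `transfer_usd`, `egress_usd`) and the sample numbers are hypothetical, and the incident-window check still needs a manual look at retry behavior.

```python
def validate_change(before, after):
    """Compare two monthly snapshots and return a list of findings to review."""
    findings = []
    if after["gateway_hours"] >= before["gateway_hours"]:
        findings.append("gateway-hours flat or higher: check for unused gateways left running")
    if after["gb_processed"] >= before["gb_processed"]:
        findings.append("GB processed did not drop: the targeted traffic source may be unchanged")
    # Adjacent costs can rise when routing changes, even if the NAT bill falls.
    adjacent_before = before["transfer_usd"] + before["egress_usd"]
    adjacent_after = after["transfer_usd"] + after["egress_usd"]
    if adjacent_after > adjacent_before:
        findings.append(
            f"adjacent transfer/egress rose by ${adjacent_after - adjacent_before:.2f}: costs may have shifted"
        )
    return findings


# Hypothetical snapshots for the month before and after a routing change.
before = {"gateway_hours": 2190, "gb_processed": 5000, "transfer_usd": 40.0, "egress_usd": 310.0}
after = {"gateway_hours": 2190, "gb_processed": 3200, "transfer_usd": 65.0, "egress_usd": 310.0}

for finding in validate_change(before, after):
    print("-", finding)
```

An empty list only means none of these simple checks fired; it does not confirm which traffic source changed.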