Azure Application Gateway pricing: how to model L7 load balancer costs
Layer 7 gateway bills usually scale with time, traffic processed, and sometimes request count. The most common mistake is budgeting for "one gateway" without modeling traffic, log volume, and peak windows.
Application Gateway pricing inputs
- Gateway hours: base hourly charges.
- Capacity units (v2 SKUs): consumption driven by compute, persistent connections, and throughput.
- Data processed: GB/month through the gateway.
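As a rough illustration of how capacity units respond to load, the sketch below takes the largest of the per-dimension ratios. The per-unit limits (2,500 persistent connections and 2.22 Mbps per capacity unit) are assumptions to verify against current Azure documentation, and `capacity_units` is a hypothetical helper, not an Azure API.

```python
# Assumed per-capacity-unit limits; confirm against current Azure documentation.
CONNECTIONS_PER_CU = 2500
MBPS_PER_CU = 2.22


def capacity_units(persistent_connections, throughput_mbps, compute_units=0.0):
    """Return the capacity units implied by the most constrained dimension.

    compute_units is passed in directly because it depends on TLS, rewrite,
    and WAF processing that is hard to derive from traffic numbers alone.
    """
    by_connections = persistent_connections / CONNECTIONS_PER_CU
    by_throughput = throughput_mbps / MBPS_PER_CU
    return max(by_connections, by_throughput, compute_units)


connection_bound = capacity_units(10_000, 1.0)   # connections dominate
throughput_bound = capacity_units(100, 20.0)     # throughput dominates
```

Running the same function on baseline and peak inputs shows which dimension drives the bill in each window.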
0) Define the gateway scope
- Environments: prod + staging + dev (hours multiply quickly).
- Public vs private: internet-facing gateways often have meaningful egress and WAF/log exposure.
- Baseline vs peak: gateways are designed for peaks; cost and reliability both depend on peak behavior.
1) Time (gateway-hours)
Start with the number of gateways and hours per month. If you run multiple regions or multiple environments, model them separately.
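A minimal sketch of the gateway-hours step, assuming a 730-hour billing month and a hypothetical fleet keyed by environment and region (the names and counts are illustrative only):

```python
HOURS_PER_MONTH = 730  # common approximation: 365 * 24 / 12


def gateway_hours(fleet, hours_per_month=HOURS_PER_MONTH):
    """Map each (environment, region) pair to its monthly gateway-hours."""
    return {key: count * hours_per_month for key, count in fleet.items()}


# Hypothetical fleet: gateway count per (environment, region).
fleet = {
    ("prod", "region-a"): 2,
    ("staging", "region-a"): 1,
    ("dev", "region-a"): 1,
}

hours_by_scope = gateway_hours(fleet)
total_gateway_hours = sum(hours_by_scope.values())
```

Keeping the result keyed by scope makes it easy to see how much of the base charge comes from non-production gateways.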
2) Requests and response sizes (traffic processed)
Request volume and response sizes determine traffic processed. If you only know RPS, convert to monthly requests, then estimate average response size to get GB/month.
Tools: RPS to monthly requests, Response transfer.
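The conversion described above can be sketched as follows, assuming a 30-day month and decimal units (1 GB = 1,000,000 KB). The 200 RPS and 50 KB figures are illustrative inputs, not measurements.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 seconds in a 30-day month


def monthly_requests(avg_rps):
    """Convert a sustained average requests-per-second figure to requests/month."""
    return avg_rps * SECONDS_PER_MONTH


def monthly_gb(requests, avg_response_kb):
    """Convert requests/month and average response size (KB) to GB/month."""
    return requests * avg_response_kb / 1_000_000


requests_per_month = monthly_requests(200)             # 518,400,000 requests
gb_per_month = monthly_gb(requests_per_month, 50)      # 25,920 GB
```

Because the result is linear in both inputs, a 2x error in average response size produces a 2x error in GB/month, which is why sampling response sizes matters.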
3) Data transfer (egress)
If the gateway sends data to the internet, outbound transfer is often billed separately and can dominate cost for high-bandwidth apps. Keep CDN bandwidth separate from origin egress if you front the gateway with a CDN.
Tool: Data egress cost calculator.
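One way to keep CDN bandwidth and origin egress as separate line items is to split the response volume by cache hit ratio. This sketch assumes all user-facing traffic is served through the CDN and that only cache misses reach the gateway; the per-GB rate is a placeholder, not a real Azure price.

```python
def split_egress(total_response_gb, cdn_hit_ratio):
    """Return (cdn_gb, origin_gb) as two separate billing quantities.

    Assumes every response leaves through the CDN, and only cache misses
    are pulled from the gateway (origin).
    """
    cdn_gb = total_response_gb
    origin_gb = total_response_gb * (1 - cdn_hit_ratio)
    return cdn_gb, origin_gb


ORIGIN_RATE_PER_GB = 0.08  # placeholder rate; substitute your actual price

cdn_gb, origin_gb = split_egress(1000, 0.75)
origin_egress_cost = origin_gb * ORIGIN_RATE_PER_GB
```

With no CDN in front, set `cdn_hit_ratio` to 0 and drop the CDN line item, so the origin carries the full volume.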
4) WAF and logging (the "second bill")
If WAF and access logs are enabled, model them separately. WAF is request-driven; logs are volume-driven (GB ingested + retention + query scans). During incidents, both can spike.
Tools: Log cost, Request-based pricing.
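A rough sketch of the log-volume side, assuming one access-log entry per request, a 30-day month, and steady daily ingest. The entry size, retention window, and incident multiplier are hypothetical inputs to replace with sampled values.

```python
def log_volume_gb(requests_per_month, bytes_per_entry, retention_days,
                  days_in_month=30):
    """Return (ingested_gb, stored_gb) for access logs.

    ingested_gb: GB written per month (one entry per request assumed).
    stored_gb:   steady-state GB held, given the retention window.
    """
    ingested_gb = requests_per_month * bytes_per_entry / 1e9
    stored_gb = ingested_gb / days_in_month * retention_days
    return ingested_gb, stored_gb


ingested_gb, stored_gb = log_volume_gb(500_000_000, 1_000, retention_days=90)

# Incident scenario: assume requests (and therefore log entries) triple.
incident_ingested_gb, _ = log_volume_gb(500_000_000 * 3, 1_000, retention_days=90)
```

Query-scan charges are left out here because they depend on how often and how widely the logs are searched.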
Worked estimate template (copy/paste)
- Gateway-hours/month = gateways * hours/month
- Requests/month = baseline requests + extra requests in peak windows (include retries)
- GB/month = requests/month * avg response size (GB)
- Log GB ingested/month = bytes logged per request * requests/month / 10^9; then add retention and query-scan charges
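The template can be turned into a small calculator. This is a sketch under stated assumptions: every rate in `Rates` is a placeholder rather than a real Azure price, retries are modeled as a simple multiplier, and retention and query-scan charges are omitted.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    # Placeholder prices, not real Azure rates; take yours from the pricing page.
    per_gateway_hour: float
    per_gb_processed: float
    per_gb_logs_ingested: float


def monthly_estimate(gateways, hours, baseline_requests, peak_requests,
                     retry_factor, avg_response_gb, log_bytes_per_request, rates):
    """Return a dict of cost line items following the template above."""
    requests = (baseline_requests + peak_requests) * retry_factor
    gb_processed = requests * avg_response_gb
    log_gb = requests * log_bytes_per_request / 1e9
    return {
        "gateway_hours": gateways * hours * rates.per_gateway_hour,
        "data_processed": gb_processed * rates.per_gb_processed,
        "logs_ingested": log_gb * rates.per_gb_logs_ingested,
    }


estimate = monthly_estimate(
    gateways=2, hours=730,
    baseline_requests=1_000_000, peak_requests=200_000, retry_factor=1.25,
    avg_response_gb=0.0001,            # about 100 KB per response
    log_bytes_per_request=1_000,
    rates=Rates(per_gateway_hour=0.5, per_gb_processed=0.01,
                per_gb_logs_ingested=2.0),
)
estimate_total = sum(estimate.values())
```

Returning separate line items rather than a single total makes it easier to see which input dominates when an assumption changes.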
Common pitfalls
- Estimating only average traffic and missing peak spikes (deploys, incidents, bots).
- Ignoring retry storms that multiply requests and logs.
- Double-counting CDN bandwidth and origin egress as the same GB.
- Turning on verbose access logs without retention and sampling guardrails.
- Budgeting prod only and forgetting staging/dev gateways.
How to validate the estimate
- Validate peak vs average RPS/GB (capacity and cost depend on peaks).
- Validate response sizes by sampling your largest endpoints separately.
- Confirm whether WAF and access logs are enabled and what retention window applies.
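For the peak-versus-average check, a quick sketch is to compute the ratio from sampled RPS values. The samples below are made up; in practice they would come from your own metrics export.

```python
import statistics


def peak_to_average(rps_samples):
    """Return peak RPS divided by mean RPS for a list of samples."""
    average = statistics.fmean(rps_samples)
    peak = max(rps_samples)
    return peak / average


# Hypothetical RPS samples, with one spike (deploy, incident, or bot traffic).
rps_samples = [100, 120, 90, 110, 480]
peak_ratio = peak_to_average(rps_samples)
```

A ratio well above 1 suggests that an estimate built only on average traffic understates both capacity needs and cost during peak windows.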