GCP load balancing pricing: hours, requests, traffic processed, and egress
Load balancer estimates become accurate when you break them into three drivers: hours, requests, and traffic processed. For public services, internet egress is usually a separate (and often larger) line item. Splitting the drivers also shows which one is worth optimizing.
GCP load balancing inputs
- LB hours: forwarding rules and VIPs.
- Requests: some products bill per request.
- Data processed: GB/month through the load balancer.
0) Decide what you are pricing
Different load balancer types have different billing units. For budgeting, most teams can start with these measurable drivers:
- Hours: how many LBs (and environments) are always on.
- Requests: baseline + peak requests/month (include retries).
- Traffic processed: GB/month through the LB (response size is the key input).
- Egress: outbound GB/month to the internet (or cross-region if applicable).
1) Hours (count always-on infrastructure)
Count the load balancers in each environment (prod, staging, dev) and the hours per month each one runs.
- Always-on staging environments add up quickly.
- Count internal and external LBs separately if you use both.
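A minimal sketch of the hours count, assuming a 730-hour average month. The inventory below and the `lb_hours` helper are hypothetical; swap in your own counts per environment and LB kind.

```python
HOURS_PER_MONTH = 730  # assumption: average month (8,760 hours / 12)


def lb_hours(lb_counts, hours_per_month=HOURS_PER_MONTH):
    """Return (hours per key, total hours) for always-on load balancers."""
    per_key = {key: count * hours_per_month for key, count in lb_counts.items()}
    return per_key, sum(per_key.values())


# Hypothetical inventory: (environment, kind) -> number of always-on LBs
inventory = {
    ("prod", "external"): 1,
    ("prod", "internal"): 1,
    ("staging", "external"): 1,
    ("dev", "external"): 1,
}

per_key, total_hours = lb_hours(inventory)

# Non-prod share of the always-on hours
non_prod_hours = sum(h for (env, _), h in per_key.items() if env != "prod")

print(total_hours, non_prod_hours)
```

In this made-up inventory, half of the always-on hours come from non-prod environments, which is the kind of thing the per-environment count is meant to surface.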
2) Requests (baseline and peak)
Convert RPS to monthly requests (RPS x seconds per month), and split out heavy endpoints if needed.
Tool: RPS to monthly requests.
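The conversion can be sketched as below, assuming a 730-hour month. The peak window and the retry multiplier are illustrative inputs, not measured values; `monthly_requests` is a hypothetical helper name.

```python
SECONDS_PER_MONTH = 730 * 3600  # assumption: 730-hour average month


def monthly_requests(baseline_rps, peak_rps=0.0, peak_hours_per_month=0.0,
                     retry_multiplier=1.0):
    """Estimate requests/month from baseline RPS plus a peak window.

    Peak hours replace baseline hours (they are not added on top),
    and the retry multiplier is applied to the combined total.
    """
    peak_seconds = peak_hours_per_month * 3600
    if peak_seconds > SECONDS_PER_MONTH:
        raise ValueError("peak window is longer than the month")
    baseline_seconds = SECONDS_PER_MONTH - peak_seconds
    raw = baseline_rps * baseline_seconds + peak_rps * peak_seconds
    return raw * retry_multiplier


# Illustrative inputs: 100 RPS baseline, 400 RPS for 60 hours/month, 5% retries
estimate = monthly_requests(100, peak_rps=400, peak_hours_per_month=60,
                            retry_multiplier=1.05)
print(f"{estimate:,.0f} requests/month")
```

With these inputs the estimate is roughly 344 million requests/month, against about 263 million from the baseline alone.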
3) Traffic processed: estimate GB from response size
If you know requests/month and average response bytes, you can estimate transfer as requests/month x average response bytes. Keep a separate line item for heavy endpoints so they do not disappear into a blended average.
Tool: Response transfer.
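A minimal sketch of the per-endpoint estimate. The endpoint names and sizes are invented, and the bytes-per-GB divisor is a parameter because the billing unit (decimal GB vs binary GiB) is an assumption you should check against your bill.

```python
def transfer_gb(requests_per_month, avg_response_bytes, bytes_per_gb=1e9):
    """Estimate GB/month from request volume and average response size."""
    return requests_per_month * avg_response_bytes / bytes_per_gb


# Hypothetical endpoints: name -> (requests/month, average response bytes)
endpoints = {
    "api": (300_000_000, 2_000),
    "downloads": (1_000_000, 50_000_000),
}

per_endpoint = {name: transfer_gb(reqs, size)
                for name, (reqs, size) in endpoints.items()}
total_gb = sum(per_endpoint.values())

for name, gb in per_endpoint.items():
    print(f"{name}: {gb:,.0f} GB/month ({gb / total_gb:.1%} of total)")
```

Here the download endpoint serves well under 1% of requests but almost all of the bytes, which a single blended average would hide.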
4) Egress: separate it from LB processing
Internet egress is often larger than LB processing for public services. If you use a CDN, keep origin egress (cache fill) separate from edge bandwidth to avoid double-counting.
Tools: Egress cost, CDN bandwidth.
- For APIs, response size (not request count) is usually the better predictor of egress.
- For file downloads, model the download endpoints separately to avoid hiding them in an average.
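One way to keep CDN edge bandwidth and origin egress apart is to derive cache fill from a byte hit ratio. This is a simplified sketch: the hit ratio is an assumed input (not something stated above), and `cdn_split` is a hypothetical helper.

```python
def cdn_split(delivered_gb, byte_hit_ratio):
    """Split delivered traffic into edge bandwidth and origin cache fill.

    Assumption: every byte missed at the edge is fetched once from the origin.
    """
    if not 0.0 <= byte_hit_ratio <= 1.0:
        raise ValueError("byte_hit_ratio must be between 0 and 1")
    return {
        "edge_gb": delivered_gb,  # everything users download, served at the edge
        "origin_fill_gb": delivered_gb * (1.0 - byte_hit_ratio),  # misses only
    }


# Illustrative: 1,000 GB delivered to users with a 75% byte hit ratio
lines = cdn_split(1000, 0.75)
print(lines)
```

The two values are priced as separate line items; adding the full delivered volume to both is the double-count to avoid.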
Worked estimate template (copy/paste)
- LB hours = LB count x hours/month (prod + non-prod)
- Requests = baseline + peak requests/month (include retries)
- Traffic processed = requests/month x avg response bytes, converted to GB (split heavy endpoints)
- Egress = outbound GB/month (internet + cross-region if applicable)
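The template can be turned into a small calculator. This is a sketch only: the `Rates` fields are hypothetical names, the numbers in the example are placeholders rather than real GCP prices, and the per-million request unit is an assumption. Fill them in from the current pricing page for your load balancer type and region.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Unit prices. All values are supplied by you; none are real GCP prices."""
    per_lb_hour: float
    per_million_requests: float
    per_gb_processed: float
    per_gb_egress: float


def estimate(lb_count, hours_per_month, requests_per_month,
             avg_response_bytes, egress_gb, rates):
    """Return one cost line per driver plus a total."""
    lb_hours = lb_count * hours_per_month
    processed_gb = requests_per_month * avg_response_bytes / 1e9  # decimal GB
    lines = {
        "hours": lb_hours * rates.per_lb_hour,
        "requests": requests_per_month / 1e6 * rates.per_million_requests,
        "processed": processed_gb * rates.per_gb_processed,
        "egress": egress_gb * rates.per_gb_egress,
    }
    lines["total"] = sum(lines.values())
    return lines


# PLACEHOLDER rates and inputs, for illustration only
placeholder_rates = Rates(per_lb_hour=0.02, per_million_requests=0.5,
                          per_gb_processed=0.01, per_gb_egress=0.1)
result = estimate(lb_count=4, hours_per_month=730,
                  requests_per_month=300_000_000, avg_response_bytes=2_000,
                  egress_gb=600, rates=placeholder_rates)

for name, cost in result.items():
    print(f"{name}: {cost:.2f}")
```

Keeping each driver as its own line makes it easy to see which one dominates before deciding what to optimize.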
Common pitfalls
- Not splitting LB processing from egress (you end up optimizing the wrong line item).
- Using one average response size and missing a heavy-tail endpoint.
- Not modeling peak windows and retry storms (both multiply request volume).
- Double-counting CDN edge bandwidth and origin egress.