Load balancer costs: what to include beyond node spend
Start with a calculator if you need a first-pass estimate, then use this guide to validate the assumptions and catch the billing traps.
Use this page when you need to decide what belongs inside the load balancer bill before you debate connection tuning, CDN offload, or retry behavior. Load balancers become a major line item when you run many always-on LBs or when traffic patterns drive high capacity-unit usage; this page covers bill scope, not workflow or optimization.
What belongs inside the load balancer bill
This guide is about bill boundaries: load-balancer-hours, LCU/NLCU unit-hours, and the adjacent CDN, cross-AZ transfer, WAF, NAT, and downstream costs that should be tracked beside the load balancer bill rather than blended into it.
- Inside the bill: LB-hours, listener-bound usage, and capacity-unit hours.
- Beside the bill: CDN offload, transfer spillover, security tooling, and origin-side infrastructure.
- Decision point: separate the bill first, then decide whether measurement or optimization is next.
The cost model (simple and reliable)
- Fixed: load balancer count × hours/month × $/LB-hour
- Usage: avg LCU/NLCU per hour × hours/month × $/unit-hour
- Peak scenario: repeat the usage line with peak units/hour for “busy/incident hours” if peaks occur regularly
Tooling: load balancer cost calculator and LCU/NLCU calculator.
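The fixed and usage lines above can be sketched in a few lines of Python. The rates and counts below are illustrative placeholders, not real provider pricing; substitute your own numbers.

```python
# Monthly LB cost sketch: fixed LB-hours plus usage unit-hours,
# with an optional peak scenario. All rates are placeholders.
HOURS_PER_MONTH = 730

def monthly_lb_cost(lb_count, rate_per_lb_hour,
                    avg_units_per_hour, rate_per_unit_hour,
                    peak_units_per_hour=0.0, peak_hours=0):
    """Fixed + usage cost; peak hours are billed at peak units instead of average."""
    fixed = lb_count * HOURS_PER_MONTH * rate_per_lb_hour
    baseline_hours = HOURS_PER_MONTH - peak_hours
    usage = avg_units_per_hour * baseline_hours * rate_per_unit_hour
    peak = peak_units_per_hour * peak_hours * rate_per_unit_hour
    return fixed + usage + peak

# Example: 12 always-on LBs, 3 avg units/hour, 40 peak hours at 20 units/hour.
print(round(monthly_lb_cost(12, 0.0225, 3, 0.008,
                            peak_units_per_hour=20, peak_hours=40), 2))
```

Note how the fixed line depends only on LB count, while the usage lines depend only on traffic; that split is what the rest of this page exploits.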
What drives LCU/NLCU usage (the part that surprises budgets)
Capacity unit billing is usually the max of several dimensions. The exact thresholds depend on the product, but the drivers are consistent:
- New connections rate: many short-lived connections increase units.
- Active connections: long-lived connections (streaming/WebSockets) increase units.
- Bytes processed: large downloads/uploads and uncompressed payloads increase units.
- Rules/processing: complex routing can add a separate driver in some models.
Deep dive: LCU/NLCU explained and estimate LCU from metrics.
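The max-of-dimensions rule can be made concrete with a small sketch. The per-unit thresholds below are made-up assumptions for illustration; real thresholds vary by product and provider, so check the pricing page before reusing them.

```python
# Capacity-unit billing sketch: the billed units for an hour are the
# maximum over several dimensions. Thresholds are assumed placeholders.
def units_for_hour(new_conns_per_sec, active_conns, gb_processed, rule_evals_per_sec):
    dims = {
        "new_connections": new_conns_per_sec / 25,      # assumed: 25 new conns/s per unit
        "active_connections": active_conns / 3000,      # assumed: 3,000 active conns per unit
        "bytes_processed": gb_processed / 1.0,          # assumed: 1 GB/hour per unit
        "rule_evaluations": rule_evals_per_sec / 1000,  # assumed: 1,000 evals/s per unit
    }
    driver = max(dims, key=dims.get)  # the dimension that sets the bill
    return dims[driver], driver

units, driver = units_for_hour(10, 9000, 2.5, 50)
print(units, driver)  # active_connections dominates: 9000 / 3000 = 3 units
```

Because only the maximum dimension bills, optimizing a non-dominant driver changes nothing; identifying the driver comes first.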
When this is not the right page
- If you need to turn metrics into a defendable units-per-hour model, go to the estimate guide.
- If you already know the dominant driver and need to change production behavior, go to the optimization guide.
- If you only need the conceptual meaning of LCU or NLCU, use the explainer page.
Common architecture patterns that inflate LB spend
- One load balancer per service (especially in Kubernetes): LB-hours scale with microservice count.
- “Everything through the LB”: large static assets and downloads that should be cached at the edge.
- Retry storms: incidents multiply requests and connections even if successful traffic is unchanged.
- Chatty clients: frequent reconnects and short timeouts increase new connections.
- Cross-AZ routing: can add transfer costs outside the load balancer line item.
How to validate the estimate (in one week)
- Count load balancers and confirm they’re intentional (remove abandoned ones).
- Pull a representative week of metrics and compute avg + p95 LCU/NLCU per hour.
- Compare peak hours: are they daily (baseline) or rare (incidents)?
- Cross-check the “top drivers” against reality: bytes processed vs connection churn vs rules.
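The avg + p95 step above can be computed with the standard library alone. `hourly_units` is a stand-in for your exported monitoring series (168 samples for a week); the synthetic data is illustrative.

```python
import statistics

def summarize(hourly_units):
    """Average and nearest-rank p95 of hourly capacity-unit samples."""
    avg = statistics.fmean(hourly_units)
    ordered = sorted(hourly_units)
    p95 = ordered[max(0, int(round(0.95 * len(ordered))) - 1)]
    return avg, p95

# Synthetic week: quiet nights, busy days, one 8-hour incident spike.
week = [1] * 100 + [4] * 60 + [20] * 8
avg, p95 = summarize(week)
print(round(avg, 2), p95)
```

In this synthetic week the p95 hour is a normal busy hour (4 units), while the 20-unit incident hours are rare; that is exactly the baseline-vs-incident distinction step three asks you to make.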
Quick inventory worksheet (copy/paste)
- LB count: how many per environment (prod/stage/dev) and why each exists
- Hours: always-on (730 hours/month) vs scheduled (business hours only)
- Average units/hour: LCU/NLCU from a representative week
- Peak units/hour: p95 hour and “incident hour” scenarios
- Bytes processed: GB/hour baseline and peak
- Connection churn: new connections/sec during normal vs incident windows
This worksheet is enough to determine whether you should optimize LB-hours (count) or unit-hours (traffic patterns).
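One way to turn the worksheet into that decision is a simple fixed-vs-usage split. The 60% threshold is an arbitrary assumption for illustration, not a rule; plug in the monthly fixed and usage costs from your own model.

```python
# Decide which lever matters first: LB count (fixed) or traffic (usage).
# The 0.6 dominance threshold is an assumed cutoff, not a standard.
def dominant_lever(fixed_cost, usage_cost):
    total = fixed_cost + usage_cost
    if fixed_cost / total >= 0.6:
        return "optimize LB-hours (consolidate or remove load balancers)"
    if usage_cost / total >= 0.6:
        return "optimize unit-hours (offload bytes, reduce connection churn)"
    return "mixed: measure more before optimizing"

print(dominant_lever(197.10, 22.96))  # fixed cost dominates here
```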
Next steps
If you need a defendable units-per-hour model, continue to the estimate guide; if you already know the dominant driver, continue to the optimization guide.
Sources
- Elastic Load Balancing pricing
- ALB CloudWatch metrics (for estimation inputs)
- NLB CloudWatch metrics (for estimation inputs)