Load balancer LCU/NLCU explained (for cost estimates)

Many load balancers charge (1) a fixed hourly fee plus (2) a usage fee billed in capacity unit-hours. For budgeting, you don’t need perfect precision. You need a defensible average units/hour figure and a peak scenario, so that incident hours don’t blow up the plan.

What “capacity unit-hours” means

Think of LCU/NLCU as a normalized “how busy was this load balancer this hour?” score. The unit is typically derived from several dimensions, and for each hour the billed units often equal the highest of the per-dimension values rather than their sum.

  • Connections: new connections and/or active connections
  • Bytes processed: traffic volume through the LB
  • Request processing: rules/routing work (depends on product and configuration)
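
As a minimal sketch of the "maximum of the dimensions" idea, the snippet below converts each driver into units and bills the largest one. The per-unit capacities in `UnitCapacity` are placeholder assumptions, not any provider's published figures, and the function names are hypothetical; substitute the values from your provider's pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitCapacity:
    """How much of each dimension one capacity unit covers (placeholder values)."""
    new_connections_per_sec: float = 25.0
    active_connections: float = 3000.0
    gb_per_hour: float = 1.0
    rule_evaluations_per_sec: float = 1000.0


def units_by_dimension(new_conn_per_sec, active_conn, gb_per_hour,
                       rule_evals_per_sec, cap=UnitCapacity()):
    """Express each driver as a number of capacity units."""
    return {
        "new_connections": new_conn_per_sec / cap.new_connections_per_sec,
        "active_connections": active_conn / cap.active_connections,
        "bytes_processed": gb_per_hour / cap.gb_per_hour,
        "rule_evaluations": rule_evals_per_sec / cap.rule_evaluations_per_sec,
    }


def billed_units(new_conn_per_sec, active_conn, gb_per_hour, rule_evals_per_sec,
                 cap=UnitCapacity()):
    """Assumes the hour is billed on the highest dimension, not the sum."""
    units = units_by_dimension(new_conn_per_sec, active_conn, gb_per_hour,
                               rule_evals_per_sec, cap)
    return max(units.values())


# One example hour: bytes processed (4 GB/hour) is the largest dimension.
example = billed_units(new_conn_per_sec=50, active_conn=1500,
                       gb_per_hour=4, rule_evals_per_sec=500)
print(example)  # 4.0
```

With these placeholder capacities, the example hour works out to 2.0, 0.5, 4.0, and 0.5 units across the four dimensions, so the hour is billed as 4.0 units.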

Why the same RPS can produce very different unit-hours

  • Payload size: 1 kB responses and 1 MB downloads differ by a factor of 1,000 in bytes processed per request.
  • Connection churn: short timeouts and frequent reconnects inflate new connections.
  • Long-lived connections: streaming/WebSockets increase active connections.
  • Incidents: retries can multiply requests and connections without increasing “real” business volume.
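
To make the payload and retry effects concrete, here is a rough sketch that turns requests per second into GB/hour. The function name and the `retry_multiplier` parameter are illustrative assumptions, and the calculation ignores request bodies and protocol overhead.

```python
SECONDS_PER_HOUR = 3600
BYTES_PER_GB = 1_000_000_000  # decimal GB, not GiB


def gb_per_hour(requests_per_sec, avg_payload_bytes, retry_multiplier=1.0):
    """Approximate bytes processed per hour for a steady request rate."""
    bytes_per_hour = (requests_per_sec * retry_multiplier
                      * avg_payload_bytes * SECONDS_PER_HOUR)
    return bytes_per_hour / BYTES_PER_GB


# Same 100 RPS, very different traffic volume:
small = gb_per_hour(100, 1_000)        # 1 kB responses
large = gb_per_hour(100, 1_000_000)    # 1 MB downloads
# Same business volume during a retry storm (each request sent 3 times):
storm = gb_per_hour(100, 1_000, retry_multiplier=3.0)

print(small, large, storm)
```

At 100 RPS, 1 kB responses come to about 0.36 GB/hour while 1 MB downloads come to about 360 GB/hour, and a 3x retry storm triples the small case to roughly 1.08 GB/hour.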

A practical mental model for optimization

  • If you have many LBs, LB-hours dominate: reduce load balancer count.
  • If you have a few hot LBs, unit-hours dominate: reduce bytes processed and connection churn.
  • If you have spikes, the “peak scenario” dominates: fix retries and bot traffic.
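
One way to check which of the first two cases applies is to split a monthly estimate into its fixed and usage parts. This is a sketch only: the prices and the 730-hour month are placeholder assumptions, and `monthly_cost_split` is a hypothetical helper.

```python
HOURS_PER_MONTH = 730  # common budgeting approximation


def monthly_cost_split(lb_count, avg_units_per_lb,
                       lb_hour_price, unit_hour_price,
                       hours=HOURS_PER_MONTH):
    """Return (fixed_cost, usage_cost, dominant_part) for a fleet of LBs."""
    fixed = lb_count * lb_hour_price * hours
    usage = lb_count * avg_units_per_lb * unit_hour_price * hours
    dominant = "lb_hours" if fixed > usage else "unit_hours"
    return fixed, usage, dominant


# Placeholder prices; use your provider's actual rates.
LB_HOUR_PRICE = 0.0225
UNIT_HOUR_PRICE = 0.008

# Many mostly idle LBs: the fixed fee dominates.
many_idle = monthly_cost_split(40, 0.5, LB_HOUR_PRICE, UNIT_HOUR_PRICE)
# A few hot LBs: unit-hours dominate.
few_hot = monthly_cost_split(3, 20, LB_HOUR_PRICE, UNIT_HOUR_PRICE)

print(many_idle[2], few_hot[2])  # lb_hours unit_hours
```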

How to estimate units/hour (without getting lost)

  1. Pick a representative week and a peak definition (p95 hour or incident hour).
  2. Collect driver metrics: new connections/sec, active connections, and bytes processed (GB/hour).
  3. Use a calculator to convert driver metrics to units/hour.
  4. Multiply units/hour by the unit-hour price, add the fixed LB hourly fee, then validate the estimate against actual usage after a week.
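
The steps above can be sketched as a blended estimate: average units for normal hours, peak units for a bounded number of peak hours, plus the fixed fee. The prices, the 730-hour month, and the function name are assumptions for illustration.

```python
def monthly_estimate(avg_units, peak_units, peak_hours,
                     lb_hour_price, unit_hour_price, hours=730):
    """Blend an average scenario with a limited number of peak hours."""
    normal_hours = hours - peak_hours
    unit_hours = avg_units * normal_hours + peak_units * peak_hours
    fixed = lb_hour_price * hours
    return fixed + unit_hours * unit_hour_price


# Placeholder prices; use your provider's actual rates.
LB_HOUR_PRICE = 0.0225
UNIT_HOUR_PRICE = 0.008

# Average of 4 units/hour, with 10 incident hours at 20 units/hour.
blended = monthly_estimate(4, 20, 10, LB_HOUR_PRICE, UNIT_HOUR_PRICE)
# For comparison: treating every hour of the month as a peak hour.
all_peak = monthly_estimate(20, 20, 0, LB_HOUR_PRICE, UNIT_HOUR_PRICE)

print(round(blended, 3), round(all_peak, 3))
```

With these placeholder numbers the blended estimate is about 41.07 per month, while pricing every hour at the peak gives about 133.23, which shows how far a peak-only budget can drift from the average.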

Common pitfalls

  • Budgeting from peak unit-hours for the whole month (overstates the bill and hides the real average).
  • Ignoring payload size and estimating from requests only.
  • Missing retry storms and noisy clients as the “hidden multiplier”.
  • Mixing units (GB vs GiB, bits vs bytes) and silently breaking the estimate.
  • Not re-checking after architecture changes (CDN, compression, routing).
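
For the unit-mixing pitfall, a couple of explicit conversion helpers can keep an estimate honest. This is a small sketch with hypothetical function names; it assumes your pricing is quoted in decimal GB, which you should confirm for your provider.

```python
BITS_PER_BYTE = 8
SECONDS_PER_HOUR = 3600
BYTES_PER_GB = 10**9    # decimal gigabyte
BYTES_PER_GIB = 2**30   # binary gibibyte


def mbps_to_gb_per_hour(megabits_per_sec):
    """Convert a sustained rate in megabits/sec to decimal GB/hour."""
    bytes_per_sec = megabits_per_sec * 1_000_000 / BITS_PER_BYTE
    return bytes_per_sec * SECONDS_PER_HOUR / BYTES_PER_GB


def gib_to_gb(gib):
    """Convert binary GiB to decimal GB (about 7.4% larger in GB terms)."""
    return gib * BYTES_PER_GIB / BYTES_PER_GB


print(mbps_to_gb_per_hour(100))  # 45.0
print(gib_to_gb(1))              # 1.073741824
```

Treating 100 Mbps as 100 MB/s would overstate bytes processed by a factor of 8, which is far larger than the GB versus GiB difference.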

Quick diagnostic: which driver is dominating?

When you look at a week of metrics, one driver is usually “the max” most of the time. You can often predict the culprit from traffic shape:

  • Bytes dominate: large responses, downloads, missing compression, no CDN offload.
  • New connections dominate: short timeouts, clients reconnecting, lack of keep-alive.
  • Active connections dominate: streaming, long polling, WebSockets.
  • Rules dominate: complex routing rules that evaluate frequently.

Once you know the dominant driver, optimization becomes targeted instead of guesswork.
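
A quick way to run this diagnostic on exported hourly metrics is to count how often each driver is the maximum. The sketch below uses placeholder per-unit capacities and made-up sample data; the metric keys are hypothetical names, not any provider's metric identifiers.

```python
from collections import Counter

# Placeholder capacities per unit; replace with your provider's figures.
CAPACITY_PER_UNIT = {
    "new_connections_per_sec": 25.0,
    "active_connections": 3000.0,
    "gb_per_hour": 1.0,
    "rule_evaluations_per_sec": 1000.0,
}


def dominant_driver(sample):
    """Return the driver that yields the most units in one hourly sample."""
    units = {name: sample[name] / cap for name, cap in CAPACITY_PER_UNIT.items()}
    return max(units, key=units.get)


def driver_histogram(samples):
    """Count how many hours each driver was the maximum."""
    return Counter(dominant_driver(s) for s in samples)


# Made-up hourly samples for illustration.
hours = [
    {"new_connections_per_sec": 10, "active_connections": 9000,
     "gb_per_hour": 2.0, "rule_evaluations_per_sec": 100},
    {"new_connections_per_sec": 100, "active_connections": 3000,
     "gb_per_hour": 2.0, "rule_evaluations_per_sec": 100},
    {"new_connections_per_sec": 10, "active_connections": 600,
     "gb_per_hour": 5.0, "rule_evaluations_per_sec": 100},
    {"new_connections_per_sec": 5, "active_connections": 300,
     "gb_per_hour": 6.0, "rule_evaluations_per_sec": 100},
]

histogram = driver_histogram(hours)
print(histogram.most_common(1))  # [('gb_per_hour', 2)]
```

In a real week of 168 hourly samples, a driver that tops the histogram by a wide margin is the one to optimize first.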

FAQ

What is an LCU/NLCU?
A capacity unit is a provider-defined measure of load balancer usage. Billing is commonly per unit-hour and is driven by dimensions like connections and bytes processed.

Why is capacity unit billing confusing?
Because it’s not just request count. Two systems with the same RPS can have very different unit-hours based on payload size, connection churn, long-lived connections, and rule/routing behavior.

How do I estimate unit-hours without perfect observability?
Start from a few measurable drivers (connections/sec, active connections, GB/hour). Build an average + peak scenario, then validate and replace assumptions with metrics after a week.

Last updated: 2026-01-27