Lambda concurrency and cold starts (cost pitfalls)

Reviewed by CloudCostKit Editorial Team. Last updated: 2026-02-07. Editorial policy and methodology.

This page covers capacity and latency measurement, not the bill boundary. The goal is to turn concurrency shape, cold-start frequency, and provisioned-concurrency windows into a defensible duration and baseline-cost model.

Concurrency and cold starts matter for both latency and cost. Cold starts often increase duration, which increases GB-seconds. Provisioned concurrency can reduce cold starts, but it adds a baseline "always-on" cost that should be justified by the latency path you are protecting.

If you are still not sure which costs belong inside the Lambda bill and which sit beside it, go back to the pricing guide first. Then use Lambda pricing to lock the budget boundary.

Concurrency model inputs

  • Peak RPS: requests per second during the highest sustained traffic window.
  • Avg duration: milliseconds per request. Concurrency ≈ peak RPS × average duration in seconds.
  • Provisioned window: hours where cold starts hurt.
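
As a rough sketch of how these inputs combine, steady-state concurrency can be approximated as arrival rate multiplied by time per request. The function names and the sample numbers below are illustrative assumptions, not measurements from a real workload.

```python
def estimate_concurrency(peak_rps: float, avg_duration_ms: float) -> float:
    """Approximate concurrent executions as arrival rate x time per request."""
    return peak_rps * (avg_duration_ms / 1000.0)


def provisioned_hours_per_month(hours_per_day: float, days_per_month: int) -> float:
    """Hours per month that a provisioned window would be enabled."""
    return hours_per_day * days_per_month


if __name__ == "__main__":
    # Hypothetical inputs: 200 requests/s at peak, 250 ms average duration.
    print(estimate_concurrency(200, 250))
    # Hypothetical window: 10 hours per day on 22 business days.
    print(provisioned_hours_per_month(10, 22))
```

The estimate assumes traffic is roughly steady inside the peak window; sharp bursts can need more concurrency than the average suggests.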

Use this page to measure, not to guess

  • Measure concurrency shape: identify whether bursts are sharp, sustained, or tied to a small number of business hours.
  • Measure cold-start frequency: note how often new execution environments appear during those windows.
  • Measure duration impact: capture how much initialization time actually stretches billed duration on the affected path.
  • Measure warm-capacity need: estimate the minimum provisioned window that protects the SLA without turning the whole workload into baseline spend.
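
One way to get the first numbers is to count REPORT lines in the function's logs: invocations that created a new execution environment carry an "Init Duration" field. The sketch below assumes the logs have already been exported as plain text lines in the usual REPORT layout; the sample lines and the function name are made up for illustration.

```python
import re

BILLED_RE = re.compile(r"Billed Duration: ([\d.]+) ms")
INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")


def summarize_reports(lines):
    """Count invocations and cold starts from REPORT log lines."""
    total = 0
    cold = 0
    init_ms = 0.0
    billed_ms = 0.0
    for line in lines:
        if not line.startswith("REPORT"):
            continue  # skip START/END and application log lines
        billed = BILLED_RE.search(line)
        if billed is None:
            continue  # not a line we know how to read
        total += 1
        billed_ms += float(billed.group(1))
        init = INIT_RE.search(line)
        if init is not None:
            cold += 1
            init_ms += float(init.group(1))
    return {
        "invocations": total,
        "cold_starts": cold,
        "cold_start_rate": cold / total if total else 0.0,
        "avg_init_ms": init_ms / cold if cold else 0.0,
        "total_billed_ms": billed_ms,
    }


# Made-up sample lines in the assumed REPORT layout.
SAMPLE = [
    "START RequestId: a1 Version: $LATEST",
    "REPORT RequestId: a1\tDuration: 310.40 ms\tBilled Duration: 311 ms\t"
    "Memory Size: 512 MB\tMax Memory Used: 90 MB\tInit Duration: 150.12 ms",
    "REPORT RequestId: a2\tDuration: 20.10 ms\tBilled Duration: 21 ms\t"
    "Memory Size: 512 MB\tMax Memory Used: 90 MB",
    "REPORT RequestId: a3\tDuration: 18.70 ms\tBilled Duration: 19 ms\t"
    "Memory Size: 512 MB\tMax Memory Used: 91 MB",
]

if __name__ == "__main__":
    print(summarize_reports(SAMPLE))
```

Running this per hour or per traffic window, rather than once per month, shows whether cold starts cluster in the windows you care about.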

How concurrency relates to cost

  • On-demand: cost ≈ requests + GB-seconds. Concurrency is a result of traffic; it’s not a separate fee.
  • Provisioned concurrency: adds baseline cost because you pay to keep capacity warm.
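
The on-demand relationship can be written as a small function. This is a minimal sketch: the rates are passed in as parameters, and the values in the example are placeholders rather than quoted prices, so replace them with the current rates for your region and architecture.

```python
def gb_seconds(requests: int, avg_billed_ms: float, memory_mb: int) -> float:
    """Billed compute: invocations x duration (s) x configured memory (GB)."""
    return requests * (avg_billed_ms / 1000.0) * (memory_mb / 1024.0)


def on_demand_cost(
    requests: int,
    avg_billed_ms: float,
    memory_mb: int,
    price_per_million_requests: float,
    price_per_gb_second: float,
) -> float:
    """Request charge plus compute charge; concurrency is not a line item."""
    request_cost = (requests / 1_000_000) * price_per_million_requests
    compute_cost = gb_seconds(requests, avg_billed_ms, memory_mb) * price_per_gb_second
    return request_cost + compute_cost


if __name__ == "__main__":
    # Placeholder rates (assumptions, not quoted prices).
    cost = on_demand_cost(
        requests=10_000_000,
        avg_billed_ms=100,
        memory_mb=512,
        price_per_million_requests=0.20,
        price_per_gb_second=0.0000166667,
    )
    print(round(cost, 2))
```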

Pricing checklist: Lambda pricing

Why cold starts can raise your bill

If a cold start adds extra initialization time, it increases the billed duration for that invocation. If cold starts happen often (spiky traffic, low steady usage), the “extra duration” can become a meaningful portion of monthly GB-seconds.

  • Spiky workloads: many cold starts per day/week.
  • Large bundles and heavy init: bigger cold start penalty.
  • Downstream dependency latency: cold starts often correlate with other slow paths.
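
To check whether the extra duration is meaningful, compare the GB-seconds added by initialization with the GB-seconds of normal execution. The sketch below assumes that initialization time is billed at the same memory size as the invocation; the function names and numbers are hypothetical.

```python
def cold_start_gb_seconds(cold_starts: int, avg_init_ms: float, memory_mb: int) -> float:
    """Extra GB-seconds attributable to initialization time."""
    return cold_starts * (avg_init_ms / 1000.0) * (memory_mb / 1024.0)


def cold_start_share(extra_gb_seconds: float, warm_gb_seconds: float) -> float:
    """Fraction of total monthly GB-seconds that comes from cold starts."""
    total = extra_gb_seconds + warm_gb_seconds
    return extra_gb_seconds / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical month: 50,000 cold starts adding 800 ms each at 1024 MB,
    # on top of 5,000,000 invocations averaging 120 ms of handler time.
    extra = cold_start_gb_seconds(50_000, 800, 1024)
    warm = 5_000_000 * 0.120 * 1.0
    print(extra, round(cold_start_share(extra, warm), 4))
```

A share of a few percent rarely justifies provisioned concurrency on cost alone; the latency argument has to carry the decision.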

A practical way to decide on provisioned concurrency

  1. Identify the latency-sensitive path (user-facing API vs background job).
  2. Measure how often cold starts happen and how much duration they add.
  3. Estimate the monthly baseline cost of provisioned concurrency for the hours you need it.
  4. Decide if the SLA/UX improvement is worth the baseline.

Tip: apply provisioned concurrency only during business hours for endpoints that need it; do not blanket-enable it before you have a measurement story.
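
Step 3 can be sketched as a comparison between a windowed schedule and an always-on configuration. The rate is a parameter and the value used in the example is a placeholder assumption, not a quoted price; the helper name is hypothetical.

```python
def provisioned_baseline_cost(
    concurrency: int,
    memory_mb: int,
    hours: float,
    price_per_gb_second: float,
) -> float:
    """Baseline charge for keeping `concurrency` environments warm for `hours`."""
    warm_gb_seconds = concurrency * (memory_mb / 1024.0) * hours * 3600
    return warm_gb_seconds * price_per_gb_second


if __name__ == "__main__":
    rate = 0.000004  # placeholder rate per GB-second (assumption)
    # Business hours only: 10 hours/day on 22 days = 220 hours.
    windowed = provisioned_baseline_cost(10, 1024, 220, rate)
    # Always on: roughly 730 hours in a month.
    always_on = provisioned_baseline_cost(10, 1024, 730, rate)
    print(round(windowed, 2), round(always_on, 2))
```

The gap between the two numbers is the baseline spend that a schedule avoids, which is the figure to weigh against the SLA improvement in step 4.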

Cost pitfalls to watch for

  • Enabling provisioned concurrency globally (adds baseline cost everywhere).
  • Running low, sparse traffic where execution environments are reclaimed between requests, so cold starts recur and long-tail latency grows.
  • Retry storms during incidents multiplying invocations (and cold starts) in a short time window.
  • Large initialization work (loading big dependencies, scanning config) inflating duration.
  • Running functions in a VPC without accounting for NAT/egress costs and longer startup times.

What to measure (so you can validate a change)

  • Invocation count and duration distribution (p50/p95) before vs after.
  • Error rate and retries (spikes often explain spend jumps).
  • For provisioned concurrency: baseline hours enabled vs actual demand windows.
  • Log ingestion GB/day (cold start debugging often increases log volume).
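
A before/after comparison of the duration distribution can be sketched with a nearest-rank percentile. The helper names are hypothetical, and the input is assumed to be a list of durations in milliseconds collected over comparable traffic windows.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def compare_durations(before_ms, after_ms):
    """Return (before, after, delta) for invocation count, p50, and p95."""
    summary = {}
    for name, pct in (("p50", 50), ("p95", 95)):
        b = percentile(before_ms, pct)
        a = percentile(after_ms, pct)
        summary[name] = (b, a, a - b)
    summary["invocations"] = (
        len(before_ms),
        len(after_ms),
        len(after_ms) - len(before_ms),
    )
    return summary


if __name__ == "__main__":
    # Made-up samples: the change trims the slow tail but not the median.
    before = [20] * 90 + [900] * 10
    after = [20] * 98 + [900] * 2
    print(compare_durations(before, after))
```

Comparing windows with similar invocation counts keeps a traffic change from being mistaken for a duration change.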

Related guides and tools

When you already know the dominant driver and need production changes, move to the optimization guide instead of reusing this page as an action checklist.

FAQ

Does concurrency itself cost money?
Not directly for on-demand Lambdas. Cost usually comes from invocations and GB-seconds. However, provisioned concurrency adds a baseline cost because you pay for pre-initialized capacity.

How do cold starts affect cost?
Cold starts often increase duration (more GB-seconds). If cold starts happen frequently, they can noticeably increase monthly compute cost and worsen latency.

When is provisioned concurrency worth it?
For latency-sensitive paths with frequent cold starts where the business value of lower tail latency justifies the baseline cost. It’s usually not worth it for spiky background jobs.
