AWS Lambda pricing (what to include)
Lambda pricing looks simple on paper, but the surprises usually come from the surrounding line items: logs, data transfer, retries, and downstream services. Use this checklist to build a realistic estimate and validate it after you deploy.
1) Start with the two core line items
- Requests: billed per invocation (often priced per 1M requests).
- Compute: billed in GB-seconds (billed duration in seconds × configured memory in GB).
Tool: AWS Lambda cost calculator
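To make the two line items concrete, here is a minimal sketch in Python. The rates are illustrative placeholders, not current AWS prices; look up the request and GB-second rates for your region and architecture before relying on the numbers.

```python
# Minimal sketch of the two core Lambda line items.
# Rates are ILLUSTRATIVE placeholders; check current AWS pricing
# for your region and architecture (x86 vs Arm).
PRICE_PER_REQUEST = 0.20 / 1_000_000   # assumed $/request
PRICE_PER_GB_SECOND = 0.0000166667     # assumed $/GB-second

def lambda_monthly_cost(invocations: int, duration_ms: float, memory_mb: int) -> float:
    """Requests + compute, where GB-seconds = seconds * GB."""
    request_cost = invocations * PRICE_PER_REQUEST
    gb_seconds = invocations * (duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * PRICE_PER_GB_SECOND

# Example: 10M invocations/month at 120 ms average on 512 MB.
print(f"~${lambda_monthly_cost(10_000_000, 120, 512):,.2f}/month")
```

At those assumed rates the compute term (~$10) is about five times the request term (~$2), which is typical: duration and memory, not request count, usually dominate.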
2) Model duration as a range (not a single number)
Duration usually has a long tail. For planning, keep at least two scenarios:
- Typical: a representative median or steady period.
- Busy/incident: p95-ish or “known heavy window” where retries, cold starts, or downstream latency increases duration.
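A cheap way to keep both scenarios in the model is to run the same cost formula over a small scenario table. The inputs below are invented; substitute your own measured typical and p95 numbers.

```python
# Scenario inputs are hypothetical; substitute your own measurements.
# Rates are the same ILLUSTRATIVE placeholders as in the sketch above.
REQ, GBS = 0.20 / 1_000_000, 0.0000166667

def cost(invocations: int, duration_ms: float, memory_mb: int) -> float:
    gb_s = invocations * (duration_ms / 1000) * (memory_mb / 1024)
    return invocations * REQ + gb_s * GBS

scenarios = {
    # name: (invocations/month, duration_ms, memory_mb)
    "typical (median)": (10_000_000, 120, 512),
    "busy/incident (p95 + retries)": (14_000_000, 450, 512),
}
for name, (inv, dur, mem) in scenarios.items():
    print(f"{name}: ~${cost(inv, dur, mem):,.2f}/month")
```

With these made-up inputs the busy scenario costs roughly four to five times the typical one, which is exactly the kind of spread a single-number budget hides.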
3) Memory settings directly change compute cost
Lambda compute cost scales with configured memory, even if your function rarely uses it. Over-allocating memory increases GB-seconds linearly. Under-allocating can backfire too: Lambda allocates CPU in proportion to memory, so a starved function may run longer and cost as much or more.
- Right-size memory using measured duration vs memory experiments.
- Watch for “downstream bound” functions: adding memory might not reduce duration if the bottleneck is a database/API.
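A hedged sketch of what a memory-tuning experiment looks like on paper. The (memory, duration) pairs are invented; the point is that GB-seconds per invocation, not memory alone, is what you are optimizing.

```python
# Hypothetical measurements from a memory-tuning experiment:
# (memory_mb, measured_avg_duration_ms) pairs for the SAME workload.
experiments = [(256, 900), (512, 450), (1024, 230), (2048, 225)]

for memory_mb, duration_ms in experiments:
    gb_s_per_invocation = (duration_ms / 1000) * (memory_mb / 1024)
    print(f"{memory_mb:>4} MB -> {duration_ms:>3} ms -> "
          f"{gb_s_per_invocation:.4f} GB-s per invocation")

# In this invented data, doubling memory roughly halves duration up to
# 1024 MB (CPU-bound), so GB-s stays flat; 2048 MB barely helps
# (downstream-bound), so GB-s nearly doubles for no latency gain.
```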
4) Add provisioned concurrency if you use it
Provisioned concurrency can improve cold-start behavior, but it adds a baseline cost. Treat it like “always-on capacity” and model it separately from on-demand invocations.
Guide: concurrency and cold starts
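A rough baseline sketch, assuming provisioned concurrency is billed per GB-second of configured capacity for as long as it is enabled; the rate below is an illustrative placeholder.

```python
# Baseline cost of provisioned concurrency: you pay for configured
# capacity for every second it is enabled, regardless of traffic.
# Rate is an ILLUSTRATIVE placeholder; check current AWS pricing.
PC_PRICE_PER_GB_SECOND = 0.0000041667   # assumed $/GB-second of capacity

def provisioned_baseline(instances: int, memory_mb: int, hours_enabled: float) -> float:
    gb_seconds = instances * (memory_mb / 1024) * hours_enabled * 3600
    return gb_seconds * PC_PRICE_PER_GB_SECOND

# Example: 10 provisioned instances at 1024 MB, enabled 24/7 for 30 days.
print(f"~${provisioned_baseline(10, 1024, 30 * 24):,.2f}/month baseline")
```

Invocations served by provisioned capacity still accrue request and duration charges at their own rates, so keep the baseline and the per-invocation terms separate in the model.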
5) Common Lambda-adjacent costs
- Logging: ingestion + retention; large JSON logs add up quickly.
- Networking: NAT/egress and cross-AZ transfer patterns depending on architecture.
- Downstream services: databases, queues, storage; they often dominate total spend in real systems.
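Of these, logging is the easiest to estimate up front: bytes per invocation times invocations, priced at the ingestion rate. A sketch with an assumed per-GB rate (CloudWatch Logs pricing varies by region):

```python
# Rough log-ingestion estimate. The per-GB rate is an ILLUSTRATIVE
# placeholder; check current CloudWatch Logs pricing for your region.
LOG_INGESTION_PER_GB = 0.50   # assumed $/GB ingested

def monthly_log_cost(invocations: int, avg_log_bytes: int) -> float:
    gb_ingested = invocations * avg_log_bytes / 1024**3
    return gb_ingested * LOG_INGESTION_PER_GB

# Example: 10M invocations emitting ~2 KB of structured JSON logs each.
print(f"~${monthly_log_cost(10_000_000, 2_048):,.2f}/month ingestion")
```

At these assumed numbers, ~2 KB of JSON per invocation costs nearly as much as the compute estimate in the first sketch, which is why log volume deserves a line of its own.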
6) Sanity-check with unit economics
Convert your estimate into “cost per 1M requests” for a typical duration/memory combination. It makes regressions obvious and helps compare to always-on compute.
Related: Lambda vs Fargate cost
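A sketch of the unit-economics check, using the same illustrative rates as above and a hypothetical $30/month always-on box for comparison:

```python
# Cost per 1M requests for one duration/memory combination.
# Rates are ILLUSTRATIVE placeholders; substitute current pricing.
PRICE_PER_1M_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667

def cost_per_million(duration_ms: float, memory_mb: int) -> float:
    gb_seconds = 1_000_000 * (duration_ms / 1000) * (memory_mb / 1024)
    return PRICE_PER_1M_REQUESTS + gb_seconds * PRICE_PER_GB_SECOND

unit = cost_per_million(duration_ms=120, memory_mb=512)
print(f"~${unit:.2f} per 1M requests")

# Compare against always-on compute: a hypothetical $30/month instance
# breaks even at roughly (30 / unit) million requests per month.
print(f"break-even vs a $30/month box: ~{30 / unit:.0f}M requests/month")
```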
Common pitfalls
- Budgeting from a single duration number and ignoring long-tail behavior.
- Over-allocating memory “just in case” without measuring the duration trade-off.
- Retry storms inflating invocation count and total duration during incidents.
- Ignoring log volume and retention until it becomes a top driver.
- Forgetting data transfer and NAT patterns when functions run in a VPC.
How to validate after you deploy
- Confirm billed request count and billed duration match your model for a representative week.
- Check p50/p95 duration and error/retry rates; duration inflation often explains spend spikes.
- Measure log ingestion GB/day and confirm retention settings match your policy.
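A sketch of the first and third checks using boto3 and CloudWatch metrics (Invocations and Duration in the AWS/Lambda namespace, IncomingBytes in AWS/Logs). The function and log group names are hypothetical placeholders, and note that the Duration metric tracks actual, not billed, duration; billed duration appears in each invocation's REPORT log line.

```python
# Sketch: pull a week of real metrics to compare against the model.
# "my-function" and its log group are hypothetical placeholders.
import boto3
from datetime import datetime, timedelta, timezone

cw = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)
start = end - timedelta(days=7)

def weekly_stat(namespace: str, metric: str, dimensions: list, stat: str) -> float:
    """One value for the week from daily datapoints, summed or averaged.
    Averaging daily averages is unweighted; fine for a sanity check."""
    resp = cw.get_metric_statistics(
        Namespace=namespace, MetricName=metric, Dimensions=dimensions,
        StartTime=start, EndTime=end, Period=86400, Statistics=[stat],
    )
    points = [dp[stat] for dp in resp["Datapoints"]]
    if not points:
        return 0.0
    return sum(points) if stat == "Sum" else sum(points) / len(points)

fn = [{"Name": "FunctionName", "Value": "my-function"}]
lg = [{"Name": "LogGroupName", "Value": "/aws/lambda/my-function"}]

invocations = weekly_stat("AWS/Lambda", "Invocations", fn, "Sum")
avg_duration_ms = weekly_stat("AWS/Lambda", "Duration", fn, "Average")
log_gb = weekly_stat("AWS/Logs", "IncomingBytes", lg, "Sum") / 1024**3

print(f"invocations/week: {invocations:,.0f}")
print(f"avg duration:     {avg_duration_ms:.0f} ms (pull p95 separately)")
print(f"log ingestion:    {log_gb:.2f} GB/week")
```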