Azure Application Insights pricing: ingestion volume, sampling, and retention
Application Insights is a telemetry pipeline. Costs become predictable if you model the chain: events * bytes per event -> GB ingested -> GB retained -> GB scanned by queries.
Quick Application Insights estimate
- Ingestion GB/day: sum of request, trace, exception, and dependency payload sizes.
- Retention days: longer retention multiplies storage baseline.
- Sampling rate: client/server sampling changes billable GB.
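The three levers above multiply together. A minimal back-of-envelope sketch, using made-up volumes (not Azure prices):

```python
# Back-of-envelope: how the three levers combine. All numbers are
# hypothetical placeholders, not Azure rates.
raw_gb_per_day = 12.0   # estimated ingestion before sampling
sampling_rate = 0.10    # keep 10% of the sampled streams
retention_days = 90     # configured retention

billable_gb_per_day = raw_gb_per_day * sampling_rate
# Steady state: each day's ingestion is kept for retention_days.
stored_gb = billable_gb_per_day * retention_days

print(f"billable GB/day: {billable_gb_per_day:.1f}")  # 1.2
print(f"stored GB (steady state): {stored_gb:.0f}")   # 108
```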
0) Define what you send
- Telemetry types: requests, dependencies, exceptions, traces, custom events.
- High-volume paths: a few endpoints often generate most traffic (and most telemetry).
- Baseline vs incident: incidents multiply errors, retries, and event sizes.
1) Ingestion volume (GB/month)
A reliable first estimate is: requests * events per request * bytes per event. Model separate lines for high-volume sources rather than one blended average.
Tool: Log/telemetry ingestion.
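The per-source modeling above can be sketched as a small table of lines, one per high-volume source. The source names, volumes, and event sizes below are illustrative assumptions, not measured values:

```python
# Per-source ingestion model: sum separate lines instead of using one
# blended average. All volumes and sizes are hypothetical.
sources = [
    # (name, requests/month, events per request, avg bytes per event)
    ("checkout API",     50_000_000,  6, 1_800),
    ("search API",      200_000_000,  3,   900),
    ("background jobs",   5_000_000, 12, 2_500),
]

total_gb = 0.0
for name, requests, events, size in sources:
    gb = requests * events * size / 1e9  # bytes -> GB
    total_gb += gb
    print(f"{name:16s} {gb:8.1f} GB/month")
print(f"{'total':16s} {total_gb:8.1f} GB/month")
```

Splitting lines this way makes it obvious which source dominates, so you know where sampling will pay off most.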
2) Sampling (the biggest cost lever)
If you sample traces at 10%, ingestion for that stream drops by roughly 10x. During incidents, exceptions and retries multiply event counts, so sampling and filters matter most exactly when traffic spikes.
- Prefer targeted full-fidelity for short windows (investigation) over always-on full-fidelity.
- Keep an incident plan: what gets sampled more or less when traffic spikes.
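A minimal sketch of how fixed-rate sampling interacts with an incident multiplier (the function name and all rates are assumptions for illustration):

```python
def billable_gb(raw_gb_per_day, sample_rate, incident_multiplier=1.0):
    """Billable ingestion after fixed-rate sampling. During an incident,
    errors and retries multiply the raw stream before sampling applies."""
    return raw_gb_per_day * incident_multiplier * sample_rate

baseline = billable_gb(20.0, 0.10)                           # ~2 GB/day
incident = billable_gb(20.0, 0.10, incident_multiplier=10)   # ~20 GB/day
```

Note that even at 10% sampling, a 10x incident multiplier returns ingestion to the unsampled baseline, which is why an incident plan for sampling rates matters.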
3) Retention and query scans
Retention multiplies stored GB. Dashboards and alerts can scan large windows repeatedly; model refresh frequency and scan windows explicitly so the "analysis bill" does not surprise you.
Tools: Retention storage, Scan/query.
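The two effects above can be modeled with two one-line formulas (the refresh cadence and volumes below are illustrative assumptions):

```python
def stored_gb(ingest_gb_per_day, retention_days):
    # Steady state: each day's ingestion is retained for retention_days.
    return ingest_gb_per_day * retention_days

def scanned_gb_per_day(refreshes_per_day, scan_window_days, ingest_gb_per_day):
    # Each refresh re-reads the whole window, so refresh frequency dominates.
    return refreshes_per_day * scan_window_days * ingest_gb_per_day

print(stored_gb(5.0, 90))               # 450.0 GB retained
print(scanned_gb_per_day(96, 30, 5.0))  # 14400.0 GB scanned/day
                                        # (15-min refresh over a 30-day window)
```

The scan number is the "quiet" cost: a dashboard refreshing every 15 minutes over a 30-day window reads nearly 3000x the daily ingestion each day.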
Worked estimate template (copy/paste)
- Ingestion GB/month = requests/month * events/request * bytes/event / 1e9 (approx)
- Retention: stored GB (steady state) = ingestion GB/day * retention days (order-of-magnitude)
- Scanned GB/day = dashboard refreshes/day * scan window (days) * ingestion GB/day
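The template above as a single function; all inputs are your own assumptions, and the output is GB, not currency:

```python
def estimate(requests_per_month, events_per_request, bytes_per_event,
             retention_days, refreshes_per_day, scan_window_days):
    """Order-of-magnitude cost-driver estimate from the template above."""
    ingest_gb_month = requests_per_month * events_per_request * bytes_per_event / 1e9
    ingest_gb_day = ingest_gb_month / 30
    return {
        "ingest GB/month": round(ingest_gb_month, 1),
        "stored GB (steady state)": round(ingest_gb_day * retention_days, 1),
        "scanned GB/day": round(refreshes_per_day * scan_window_days * ingest_gb_day, 1),
    }

# Example with hypothetical inputs: 100M requests/month, 5 events/request,
# 1500 bytes/event, 30-day retention, hourly dashboard over a 7-day window.
result = estimate(100_000_000, 5, 1_500, 30, 24, 7)
print(result)  # {'ingest GB/month': 750.0, 'stored GB (steady state)': 750.0,
               #  'scanned GB/day': 4200.0}
```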
Common pitfalls
- Using one average: a few endpoints can dominate traffic and telemetry.
- Leaving traces at full fidelity for all services and all environments.
- Not accounting for incident multipliers (errors and retries can multiply event volume by 10x or more).
- Dashboards scanning wide windows frequently (quiet scan cost).
- Keeping long retention by default without a clear reason.
How to validate the estimate
- Validate sampling and filters: confirm what is kept vs. what is dropped.
- Validate incident behavior: errors and retries multiply telemetry events and ingestion volume.
- Validate dashboards/alerts for wide time windows that scan lots of data frequently.