Azure API Management pricing: model requests, transfer, and log volume
API Management pricing is easiest to estimate with a driver model: requests, transfer, and logs. Underestimates usually come from retry storms, large payload endpoints, and unbounded access logs.
0) Define what you are estimating
- Which APIs: not all endpoints have the same payload sizes or traffic.
- Baseline vs peak: incidents and deploys can multiply retries and traffic.
- Logging plan: always-on full logs vs targeted sampling (logging can become the second bill).
1) Requests (per month)
Start from RPS, convert to monthly requests, then split endpoints into at least two buckets if response sizes vary significantly (for example: small JSON vs large exports).
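The conversion above can be sketched in a few lines of Python. This is a minimal illustration, not a pricing tool: the 30-day month, the traffic figure, and the bucket names and shares are all assumptions to replace with your own measurements.

```python
# Assumption: a 30-day month; adjust if your billing period differs.
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2_592_000


def monthly_requests(avg_rps: float) -> float:
    """Convert a sustained average requests-per-second rate to requests per month."""
    return avg_rps * SECONDS_PER_MONTH


# Hypothetical traffic: 50 RPS on average, split by share of requests per bucket.
total_rps = 50.0
bucket_share = {"small_json": 0.95, "large_export": 0.05}

requests_by_bucket = {
    name: monthly_requests(total_rps * share)
    for name, share in bucket_share.items()
}

for name, count in requests_by_bucket.items():
    print(f"{name}: {count:,.0f} requests/month")
```

Use the average RPS over the whole month here, not the peak RPS; peaks and retries are layered on afterwards.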
Tools: RPS to monthly requests, API request cost.
2) Response transfer (GB/month)
Estimate transfer from response size and request volume. If you front APIs with a CDN, keep CDN edge bandwidth separate from origin egress and avoid double-counting.
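As a rough sketch of the transfer estimate, the snippet below computes response GB per bucket and then separates CDN edge bandwidth from origin egress. The request counts, response sizes, and cache hit ratios are hypothetical, and treating origin egress as total transfer times the cache miss ratio is a simplifying assumption (it ignores revalidation and cache-fill overhead).

```python
def transfer_gb(requests: float, avg_response_bytes: float) -> float:
    """Response transfer in decimal GB (1 GB = 1e9 bytes)."""
    return requests * avg_response_bytes / 1e9


def split_cdn(total_gb: float, cache_hit_ratio: float) -> tuple[float, float]:
    """Return (cdn_edge_gb, origin_egress_gb) as two separate line items.

    Assumption: the CDN serves every byte to clients, and only cache misses
    are pulled from the origin.
    """
    edge_gb = total_gb
    origin_gb = total_gb * (1.0 - cache_hit_ratio)
    return edge_gb, origin_gb


# Hypothetical buckets: requests/month, average response size, CDN hit ratio.
buckets = {
    "small_json": (120_000_000, 2_000, 0.0),      # uncacheable API responses
    "large_export": (6_000_000, 5_000_000, 0.8),  # mostly cacheable downloads
}

for name, (requests, avg_bytes, hit_ratio) in buckets.items():
    total = transfer_gb(requests, avg_bytes)
    edge, origin = split_cdn(total, hit_ratio)
    print(f"{name}: edge {edge:,.0f} GB, origin egress {origin:,.0f} GB")
```

Price the two returned figures against different rates rather than summing them into one GB total.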
Tools: Response transfer, Egress cost.
3) Logs and observability
Access logs and application logs scale with requests. A simple estimate is: bytes logged per request * monthly requests. If you store logs for weeks, retention becomes a large secondary driver.
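A minimal sketch of the log estimate follows. The bytes-per-request figure, the sampling rate, and the retention period are illustrative assumptions, and the retained-volume formula assumes steady ingestion with no compression or tiering.

```python
def log_ingestion_gb(requests_per_month: float,
                     bytes_per_request: float,
                     sample_rate: float = 1.0) -> float:
    """Monthly log ingestion in decimal GB; sample_rate=1.0 means log everything."""
    return requests_per_month * bytes_per_request * sample_rate / 1e9


def retained_gb(ingestion_gb_per_month: float,
                retention_days: int,
                days_per_month: int = 30) -> float:
    """Approximate steady-state stored volume for a given retention window."""
    return ingestion_gb_per_month * retention_days / days_per_month


# Hypothetical inputs: about 129.6M requests/month, 1 KB logged per request.
full = log_ingestion_gb(129_600_000, 1_000)
sampled = log_ingestion_gb(129_600_000, 1_000, sample_rate=0.1)

print(f"full logging: {full:.1f} GB/month, "
      f"stored at 90 days: {retained_gb(full, 90):.1f} GB")
print(f"10% sampling: {sampled:.1f} GB/month, "
      f"stored at 90 days: {retained_gb(sampled, 90):.1f} GB")
```

Ingestion and retained storage are usually billed separately, so keep them as two lines in the estimate.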
Tools: Log ingestion, Retention storage.
4) Retry behavior (the hidden multiplier)
- Timeouts and client retries multiply requests and transfer.
- Failed authentication loops can create high request rates with small payloads (cheap transfer, expensive requests/logs).
- Backfills and replays often look like one-time events but can run for weeks.
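One way to put a number on the retry multiplier is to model expected attempts per logical request. The sketch below assumes each attempt fails independently with the same probability and that clients stop after a fixed number of retries; real retry storms are often worse because failures are correlated. The failure rates and incident hours are hypothetical.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month


def expected_attempts(failure_rate: float, max_retries: int) -> float:
    """Expected attempts per logical request.

    Attempt k (0-based) happens only if the previous k attempts all failed,
    so the expectation is 1 + p + p^2 + ... + p^max_retries.
    """
    return sum(failure_rate ** k for k in range(max_retries + 1))


def blended_multiplier(normal_failure_rate: float,
                       incident_failure_rate: float,
                       incident_hours: float,
                       max_retries: int) -> float:
    """Time-weighted request multiplier, assuming traffic is even across hours."""
    normal = expected_attempts(normal_failure_rate, max_retries)
    incident = expected_attempts(incident_failure_rate, max_retries)
    normal_hours = HOURS_PER_MONTH - incident_hours
    return (normal_hours * normal + incident_hours * incident) / HOURS_PER_MONTH


# Hypothetical month: 1% failures normally, 50% during 12 incident hours,
# and clients retrying up to 3 times.
multiplier = blended_multiplier(0.01, 0.5, incident_hours=12, max_retries=3)
print(f"request multiplier: {multiplier:.4f}")
```

Apply the multiplier to requests and logs. Apply it to transfer only when the failed attempts still return a response body.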
Worked estimate template (copy/paste)
- Requests/month = baseline requests + extra requests during peaks (include retries in both)
- Transfer GB/month = requests/month * avg response size (GB) (estimate large-payload endpoints separately)
- Logs GB/month = requests/month * bytes logged per request / 1e9 (approx, using 1 GB = 1e9 bytes)
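The template can also be kept as a small script so that each endpoint bucket is estimated on its own and then summed. This is a sketch under stated assumptions: the `Bucket` class, its field names, and every number below are hypothetical, and the output is volumes only; multiply by your tier's current unit prices separately.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """One group of endpoints with similar payload sizes (hypothetical model)."""
    name: str
    baseline_requests: float      # requests/month at normal load
    peak_extra_requests: float    # additional requests/month from peaks
    retry_multiplier: float       # 1.0 means no retries
    avg_response_bytes: float
    log_bytes_per_request: float

    def requests(self) -> float:
        return (self.baseline_requests + self.peak_extra_requests) * self.retry_multiplier

    def transfer_gb(self) -> float:
        return self.requests() * self.avg_response_bytes / 1e9

    def logs_gb(self) -> float:
        return self.requests() * self.log_bytes_per_request / 1e9


buckets = [
    Bucket("small_json", 100e6, 20e6, 1.1, 2_000, 500),
    Bucket("large_export", 5e6, 1e6, 1.0, 5_000_000, 500),
]

for b in buckets:
    print(f"{b.name}: {b.requests():,.0f} requests, "
          f"{b.transfer_gb():,.0f} GB transfer, {b.logs_gb():,.1f} GB logs")

print(f"total transfer: {sum(b.transfer_gb() for b in buckets):,.0f} GB/month")
```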
Common pitfalls
- Using one average response size across endpoints with wildly different payloads.
- Ignoring retry storms (requests and logs multiply during incidents).
- Double-counting CDN bandwidth and origin egress as the same GB.
- Assuming logs are small; high-cardinality logs can explode ingestion.
- Estimating only the gateway and ignoring downstream cost (compute, databases, and logs).
How to validate the estimate
- Validate request volume from a representative week and scale to monthly.
- Sample real responses to confirm average payload sizes for your top endpoints.
- Validate log bytes/request and retention windows, and check how far back dashboards and queries scan.