Azure Cosmos DB pricing: a practical estimate (RU/s, storage, and egress)
Cosmos DB pricing becomes manageable when you separate the three cost drivers: capacity (RU/s), storage, and data movement. Most surprises come from bursty RU demand, skewed partitions (a hot partition), or retries that amplify read/write volume.
0) Define the scope (containers, regions, and workloads)
- Top containers: identify the few containers that handle most traffic (budget them separately).
- Regions: multi-region reads/writes can change both RU consumption and data transfer volume.
- Peak windows: deploys, backfills, and incidents often define the RU peak.
1) Capacity (RU/s)
Estimating RU/s is a capacity-planning exercise. Build the estimate from the operations that dominate your workload (reads, writes, queries) and size it for a peak-demand window, not a daily average. If you're migrating, keep two scenarios, baseline and high-usage (peak), rather than a single average.
- Hot partitions raise RU needs: one key or one partition range can dominate.
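- Retry multiplier: if each request is attempted three times in total during an incident (the original plus two retries), RU and request volume can roughly triple.
- Query shape matters: a point read by id and partition key usually costs far fewer RUs than a cross-partition query or wide scan, so model them as separate operations.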
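The capacity reasoning above can be sketched as a small calculation. This is a minimal sketch, not a sizing tool: the per-operation RU charges, request rates, retry multiplier, partition count, and hot-partition share are all hypothetical placeholders, and the skew function assumes provisioned throughput is split evenly across partitions. Measure real RU charges on your own requests before relying on any of these numbers.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    name: str
    ru_per_op: float       # assumed RU charge per call; measure it on real requests
    ops_per_second: float  # request rate in the window being modeled


def demand_ru_per_second(ops, retry_multiplier=1.0):
    """Sum RU/s across operations, scaled by total attempts per logical request."""
    return sum(op.ru_per_op * op.ops_per_second for op in ops) * retry_multiplier


def provisioned_ru_for_skew(demand, partition_count, hot_share):
    """RU/s to provision so the hottest partition can serve its share.

    Simplifying assumption: throughput is divided evenly across partitions,
    so the partition receiving the largest share of traffic sets the total.
    """
    even_share = 1.0 / partition_count
    return demand * max(hot_share, even_share) * partition_count


# Hypothetical peak-window workload (every number here is a placeholder).
peak_ops = [
    Operation("point_read", ru_per_op=1.0, ops_per_second=2000),
    Operation("write", ru_per_op=10.0, ops_per_second=300),
    Operation("query", ru_per_op=50.0, ops_per_second=20),
]

baseline_ru = demand_ru_per_second(peak_ops)                        # 6000.0
incident_ru = demand_ru_per_second(peak_ops, retry_multiplier=3.0)  # 18000.0
skewed_ru = provisioned_ru_for_skew(baseline_ru, partition_count=4, hot_share=0.5)  # 12000.0
```

In this made-up workload, a hot partition taking half the traffic doubles the capacity needed even though total demand is unchanged, which is why skew and retries are modeled as separate multipliers.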
2) Storage (GB-month)
Storage is usually the most straightforward line item. Use average stored GB for the month and model growth separately. If you retain history or store large documents, storage becomes a material cost over time.
Tools: Storage pricing, Storage growth.
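As a rough illustration of modeling growth separately, the sketch below projects stored GB month by month and sums the resulting GB-months. It assumes linear growth, and the starting size, growth rate, and price per GB-month are hypothetical placeholders, not Azure rates.

```python
def monthly_stored_gb(start_gb, growth_gb_per_month, months):
    """Stored GB at each month of the horizon, assuming linear growth."""
    return [start_gb + growth_gb_per_month * m for m in range(months)]


def storage_cost(start_gb, growth_gb_per_month, months, price_per_gb_month):
    """Total storage cost over the horizon (sum of monthly GB x unit price)."""
    stored = monthly_stored_gb(start_gb, growth_gb_per_month, months)
    return sum(gb * price_per_gb_month for gb in stored)


# Placeholder inputs: 500 GB today, growing by 50 GB each month, for 12 months.
stored_by_month = monthly_stored_gb(500, 50, 12)
total_gb_months = sum(stored_by_month)                              # 9300
yearly_cost = storage_cost(500, 50, 12, price_per_gb_month=0.25)    # 2325.0
```

If retention policies delete old data, the growth term flattens; a fuller model would subtract expired GB each month.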
3) Networking and egress
If your application serves users across geographies or reads across regions, data transfer can become a meaningful cost. Model outbound GB/month by destination, and do not blend internet and cross-region paths, because they are priced differently.
Tool: Data egress cost.
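One way to keep transfer paths from being blended is to price each path on its own line. In the sketch below, the path names, volumes, and per-GB prices are hypothetical placeholders; substitute the rates that apply to your regions.

```python
def transfer_cost(gb_by_path, price_per_gb_by_path):
    """Cost per transfer path; refuses to guess a price for an unpriced path."""
    missing = set(gb_by_path) - set(price_per_gb_by_path)
    if missing:
        raise ValueError(f"no price for transfer paths: {sorted(missing)}")
    return {
        path: gb * price_per_gb_by_path[path]
        for path, gb in gb_by_path.items()
    }


# Placeholder monthly volumes and prices, kept separate per path.
monthly_gb = {"internet": 800.0, "cross_region": 200.0}
price_per_gb = {"internet": 0.08, "cross_region": 0.02}

costs = transfer_cost(monthly_gb, price_per_gb)
total_transfer = sum(costs.values())
```

Raising an error on an unpriced path is a deliberate choice here: a silently defaulted price is exactly how transfer boundaries go missing from a model.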
Worked estimate template (copy/paste)
- RU baseline = typical peak RU/s for top containers
- RU peak = incident/backfill peak RU/s (keep separate)
- Storage = avg GB-month + growth (GB/month)
- Transfer = internet egress GB/month + cross-region GB/month (modeled separately)
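The template above can be filled in as code so that the baseline and peak scenarios stay side by side. This is a sketch under stated assumptions: every rate is a hypothetical placeholder, throughput is assumed to be billed per 100 RU/s per hour, and a month is approximated as 730 hours. Check your own pricing page and billing model before reusing the structure.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # approximation used in this sketch


@dataclass
class Scenario:
    name: str
    ru_per_second: float       # provisioned RU/s for the top containers
    stored_gb: float           # average stored GB in the month
    internet_egress_gb: float  # outbound GB/month to the internet
    cross_region_gb: float     # GB/month moved between regions


@dataclass
class Rates:  # all values are placeholders supplied by the caller
    per_100_ru_hour: float
    per_gb_month: float
    per_internet_gb: float
    per_cross_region_gb: float


def monthly_estimate(s, r):
    """Return each cost line separately, plus the total."""
    capacity = s.ru_per_second / 100 * r.per_100_ru_hour * HOURS_PER_MONTH
    storage = s.stored_gb * r.per_gb_month
    transfer = (s.internet_egress_gb * r.per_internet_gb
                + s.cross_region_gb * r.per_cross_region_gb)
    return {"capacity": capacity, "storage": storage,
            "transfer": transfer, "total": capacity + storage + transfer}


rates = Rates(per_100_ru_hour=0.01, per_gb_month=0.25,
              per_internet_gb=0.08, per_cross_region_gb=0.02)
baseline = Scenario("baseline", 6000, 500, 800, 200)
peak = Scenario("peak", 18000, 500, 800, 200)

estimates = {s.name: monthly_estimate(s, rates) for s in (baseline, peak)}
```

Pricing the peak scenario for a full month is deliberately pessimistic; it gives an upper bound to compare against the baseline rather than a forecast.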
Common pitfalls
- Estimating RU from averages and ignoring peaks (capacity is set by peak demand).
- Ignoring skew: one hot partition or one hot query dominates RU.
- Retry storms: timeouts and client retries multiply RU consumption.
- Blending all containers into one model (budget the top containers separately).
- Overlooking transfer boundaries (internet vs cross-region) in multi-region deployments.
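The first pitfall is easy to see with numbers. The sketch below compares the mean of a series of RU/s samples with the highest sustained window; the sample values and the three-sample window are hypothetical, and in practice the samples would come from your own metrics.

```python
def mean(samples):
    """Plain average of the samples."""
    return sum(samples) / len(samples)


def rolling_peak(samples, window):
    """Highest average over any run of `window` consecutive samples."""
    if window <= 0 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    current = sum(samples[:window])
    best = current
    for i in range(window, len(samples)):
        # Slide the window: add the new sample, drop the oldest one.
        current += samples[i] - samples[i - window]
        best = max(best, current)
    return best / window


# Placeholder per-minute RU/s samples: steady load with a short burst.
ru_samples = [1000] * 10 + [5000] * 3 + [1000] * 7

average_ru = mean(ru_samples)                 # 1600.0
peak_ru = rolling_peak(ru_samples, window=3)  # 5000.0
```

In this made-up series, sizing to the average would leave the burst more than three times over capacity.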
How to validate the estimate
- Validate RU consumption and throttling during a representative peak window.
- Validate retry behavior and incident multipliers (retries multiply RU and requests).
- Validate growth rate and retention policies for stored data.
- Reconcile your model against one billing cycle and keep the peak scenario.
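The reconciliation step can be sketched as a line-by-line comparison of modeled and billed amounts. The line names, amounts, and the 15% tolerance below are hypothetical; the billed figures would come from your own invoice or cost export.

```python
def reconcile(modeled, billed, tolerance=0.15):
    """Compare modeled vs billed cost per line and flag large deviations.

    A line present in the bill but absent from the model (or modeled as 0)
    is always flagged, because its relative deviation is undefined.
    """
    report = {}
    for line in sorted(set(modeled) | set(billed)):
        m = modeled.get(line, 0.0)
        b = billed.get(line, 0.0)
        deviation = (b - m) / m if m else None
        flagged = deviation is None or abs(deviation) > tolerance
        report[line] = {"modeled": m, "billed": b,
                        "deviation": deviation, "flagged": flagged}
    return report


# Placeholder amounts for one billing cycle.
modeled = {"capacity": 400.0, "storage": 100.0}
billed = {"capacity": 500.0, "storage": 105.0, "transfer": 30.0}

report = reconcile(modeled, billed)
flagged_lines = [line for line, row in report.items() if row["flagged"]]
# flagged_lines == ["capacity", "transfer"]
```

Here capacity is flagged for exceeding the tolerance and transfer is flagged because the model never included it, which is the kind of gap one billing cycle tends to expose.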