Log retention storage cost: steady-state GB-month explained
Log bills usually separate ingestion (GB ingested) from retention/storage (GB-month). Retention is the lever that quietly turns a stable ingestion rate into a growing long-term cost.
The steady-state model (the key idea)
If you retain logs for N days and ingest roughly GB/day, then once you've been logging longer than N days, the stored dataset is roughly:
- Stored GB ≈ GB/day x N
Storage cost is then stored GB x $/GB-month (using your provider billing model).
Retention tiers (hot vs cold)
A common pattern is to keep a short "hot" window for interactive troubleshooting and a longer "cold" window for compliance or rare investigations:
- Hot: 7-14 days at full fidelity (fast queries)
- Cold: 30-180 days archived or stored cheaply (slower retrieval)
Worked example (sanity-check your model)
Suppose you ingest 10 GB/day and retain logs for 30 days. Once you reach steady state, stored logs are roughly 300 GB. If storage is priced per GB-month, you can estimate monthly storage cost as 300 GB x ($/GB-month), then add ingestion and any query/scan costs separately.
- If you increase retention from 30 days to 90 days at the same ingest rate, stored GB roughly triples.
- If you reduce ingest (sampling, filtering, structured logs), retention cost falls linearly too.
Do not forget query/scan behavior
Storage is only one part of "retention cost". If your team frequently searches or scans logs (especially during incidents), scan/search can become a second spike. Keep scan volume and search frequency as explicit inputs.
Tools: Log scan/search cost, Log cost overview.
Fast levers (reduce cost without losing signal)
- Shorten retention for low-value debug logs; keep longer retention for sparse high-value security/audit logs.
- Sample noisy logs and reduce per-request log bytes (structured fields, fewer stack traces).
- Move cold retention to cheaper tiers/object storage if your provider supports it.
- Set defaults in IaC to prevent retention drift across new log groups/services.
Common pitfalls
- Retention drift: default policies slowly expand across more log groups.
- High-volume debug logs: long retention on verbose logs dominates storage.
- Mixing units: GB vs GiB across dashboards and pricing tables.
Tools
Sources
- CloudWatch pricing (example of ingestion + storage billing model)
- Cloud Logging pricing (example of logging pricing model)