Estimate log ingestion volume (GB/day): fast methods + validation
Most log pricing models start with ingestion volume: how many GB of logs you send per day. If you do not have a clean export yet, you can still estimate GB/day with a few practical methods and validate quickly once you have real telemetry.
Method 1: From your vendor usage export (best)
If you already have a bill or usage dashboard, take the average daily ingestion over a representative window (7 or 30 days). This is the most accurate planning input.
- Prefer a window that includes a normal day and a peak day.
- If your workload is seasonal, keep separate baseline and peak months.
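The averaging above can be sketched in a few lines. The daily numbers here are hypothetical placeholders; substitute the values from your own usage export:

```python
# Sketch: average vs. peak GB/day over a 7-day usage-export window.
# These numbers are illustrative, not real measurements.
daily_gb = [118, 122, 120, 119, 310, 125, 121]  # day 5 was an incident spike

baseline = sum(daily_gb) / len(daily_gb)  # average GB/day over the window
peak = max(daily_gb)                      # worst single day in the window

print(f"average GB/day: {baseline:.1f}")
print(f"peak GB/day:    {peak}")
```

Keeping both numbers matters: the average drives your typical bill, while the peak tells you whether an incident week will blow through a plan's included volume.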
Method 2: From events per second and average event size
If you can estimate event rate and average event size, you can estimate GB/day with this formula (decimal GB):
- GB/day ~= events/sec × avg bytes/event × 86,400 ÷ 1,000,000,000
Tool: Log ingestion cost calculator (includes event-rate conversion).
- Sample real logs to estimate bytes/event (do not guess a single number for everything).
- Split by source: access/ingress, application logs, audit/security logs.
- Keep a peak multiplier for incidents (errors and retries increase logs dramatically).
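The formula and the per-source split above can be combined in a short sketch. The event rates, event sizes, and the 1.5× peak multiplier below are illustrative assumptions, not measurements; sample your own logs to replace them:

```python
# Method 2 sketch: GB/day from events/sec and avg bytes/event, split by source.
sources = {
    # name: (events/sec, avg bytes/event) -- hypothetical values
    "ingress_access": (2000, 350),
    "application":    (800, 600),
    "audit_security": (150, 1200),
}

SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 1_000_000_000  # decimal GB, matching the formula above

def gb_per_day(eps: float, avg_bytes: float) -> float:
    return eps * avg_bytes * SECONDS_PER_DAY / BYTES_PER_GB

baseline = sum(gb_per_day(eps, b) for eps, b in sources.values())
peak = baseline * 1.5  # assumed incident multiplier; tune to your history

print(f"baseline: {baseline:.1f} GB/day, peak: {peak:.1f} GB/day")
```

Splitting by source also shows you where to optimize first: in this sketch, ingress access logs alone account for roughly half of the total.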
Method 3: From throughput (Mbps)
If you have throughput charts for your log shipper or exporter, convert average throughput into GB/day. Make sure you distinguish Mbps (megabits per second) from MB/s (megabytes per second): divide Mbps by 8 to get MB/s.
Tool: Unit converter.
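The conversion can be sketched as follows. The 12 Mbps average is a hypothetical reading from a throughput chart:

```python
# Method 3 sketch: average shipper throughput (Mbps) -> GB/day.
# Key pitfall: Mbps is megaBITS per second; divide by 8 to get megabytes.
avg_mbps = 12.0  # assumed average from your shipper's throughput chart

mb_per_sec = avg_mbps / 8                 # MB/s (bytes)
gb_per_day = mb_per_sec * 86_400 / 1000   # decimal GB/day

print(f"{avg_mbps} Mbps is about {gb_per_day:.1f} GB/day")
```

Forgetting the divide-by-8 step inflates the estimate by 8×, which is one of the most common reasons throughput-based estimates miss.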
What to include (and what to separate)
- Include: access logs, application logs, audit logs, infrastructure logs (Kubernetes, systemd), security logs (WAF/firewall).
- Separate: one-time migrations, bulk debug dumps, and synthetic monitoring (model as special cases).
Common pitfalls (why estimates miss)
- Duplicate shipping: two agents ship the same logs (doubles ingestion).
- Verbose debug logs: one noisy service dominates total volume.
- Multiline events: stack traces can be much larger than normal log lines.
- High-volume sources: ingress/firewall/audit logs are often bigger than application logs.
- Incident spikes: retries and errors create a peak month that looks nothing like baseline.
Next: translate GB/day into dollars (and retention)
Once you have GB/day, you typically need at least two more line items: retention storage and optional scan/search.
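A minimal cost sketch with those three line items might look like this. Every rate below is a hypothetical placeholder; substitute your vendor's actual prices and pricing model (some vendors fold retention into the ingest rate, or price scan differently):

```python
# Sketch: GB/day -> monthly line items (ingest + retention + scan).
# All rates are assumed placeholders, not real vendor prices.
gb_per_day = 120.0
ingest_rate = 0.50       # $/GB ingested (assumed)
retention_rate = 0.03    # $/GB-month retained (assumed)
retention_days = 30
scan_rate = 0.005        # $/GB scanned (assumed)
gb_scanned_month = 2000  # assumed monthly query/scan volume

ingest_cost = gb_per_day * 30 * ingest_rate
retained_gb = gb_per_day * retention_days  # simple flat-retention model
retention_cost = retained_gb * retention_rate
scan_cost = gb_scanned_month * scan_rate

total = ingest_cost + retention_cost + scan_cost
print(f"ingest ${ingest_cost:.0f} + retention ${retention_cost:.0f} "
      f"+ scan ${scan_cost:.0f} = ${total:.0f}/month")
```

Note how ingestion dominates in this sketch; that is typical, and it is why GB/day is the number worth estimating carefully first.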
How to validate
- Pick your top 3 log sources and measure their actual bytes/event (sample 100–1000 events).
- Validate baseline and peak separately (incident week vs normal week).
- After changes, verify ingestion GB/day moves in the expected direction and the bill follows.
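The bytes/event measurement in the first step can be sketched as below, assuming a local sample file with one event per line (multiline events such as stack traces need to be joined into single events first, or they will be undercounted):

```python
# Sketch: estimate avg bytes/event by sampling lines from a local log file.
# Assumes one event per line; join multiline events before sampling.
import random

def avg_bytes_per_event(path: str, sample_size: int = 1000) -> float:
    with open(path, "rb") as f:
        lines = f.readlines()
    sample = random.sample(lines, min(sample_size, len(lines)))
    return sum(len(line) for line in sample) / len(sample)
```

Run this separately for each of your top sources; access logs, application logs, and audit logs usually have very different average sizes, which is exactly why a single guessed number misses.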