Estimate Logs Insights scanned GB (from query habits)
Start with a calculator if you need a first-pass estimate, then use this guide to validate the assumptions and catch the billing traps.
This page is the scanned-GB measurement workflow, not the bill-boundary page: the goal is to turn log volume, query time range, query frequency, and incident behavior into a defensible scanned-GB-per-month model. Logs Insights is typically priced by GB scanned.
If you are still not sure which costs belong inside the Logs Insights bill versus beside it, go back to the pricing guide first.
Step 1: estimate log volume (GB/day)
- Use measured ingestion GB/day if you have it (best).
- If not, estimate from requests/day × bytes/log and convert to GB/day.
- Separate noisy success logs from error/security logs; they behave differently.
Related: CloudWatch Logs pricing.
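When measured ingestion is unavailable, the requests-based estimate above can be sketched like this. The request count and bytes-per-log figures are hypothetical placeholders; substitute your own measurements.

```python
def estimated_gb_per_day(requests_per_day: float, bytes_per_log: float,
                         logs_per_request: float = 1.0) -> float:
    """Rough GB/day from request volume when measured ingestion isn't available."""
    bytes_per_gb = 1024 ** 3
    return requests_per_day * logs_per_request * bytes_per_log / bytes_per_gb

# Example with assumed numbers: 5M requests/day, ~800 bytes per log line
print(round(estimated_gb_per_day(5_000_000, 800), 2))  # ~3.73 GB/day
```

Remember to model noisy success logs and error/security logs separately, since their volumes and query patterns differ.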
Step 2: estimate scanned GB per query
First-order approximation (single log group): scanned GB/query ~= (GB/day) × (query_hours / 24).
- If you query 1 hour of data, query_hours/24 ~= 1/24.
- If you query 7 days, query_hours/24 ~= 7.
- If you query multiple log groups, add them (or use total GB/day for those groups).
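The first-order approximation above is just a ratio of the query window to a full day. A minimal sketch, with illustrative numbers only:

```python
def scanned_gb_per_query(gb_per_day: float, query_hours: float) -> float:
    """First-order scanned-GB approximation for a single log group:
    (GB/day) * (query_hours / 24)."""
    return gb_per_day * query_hours / 24

# 1-hour window over a 60 GB/day log group
print(round(scanned_gb_per_query(60, 1), 2))        # 2.5
# 7-day window (168 hours) over the same group
print(round(scanned_gb_per_query(60, 7 * 24), 1))   # 420.0
```

For multiple log groups, call this per group and sum the results, or pass the combined GB/day for all queried groups.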
Step 3: estimate query frequency (the hidden driver)
- Dashboards: users/day × dashboard views/day × refreshes/view.
- Ad-hoc queries: engineer queries/day (incident days can be 10–50× higher).
- Scheduled jobs: recurring searches and reports that run automatically.
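The three frequency drivers above add up to a single queries/day figure. One thing the bullets imply but don't state: each dashboard view usually executes several widget queries, so multiply by queries-per-refresh too. All inputs here are hypothetical:

```python
def total_queries_per_day(dashboard_users: int, views_per_user: float,
                          refreshes_per_view: float, queries_per_refresh: int,
                          adhoc_per_day: float, scheduled_per_day: float) -> float:
    """Queries/day = dashboard refreshes (x widget queries) + ad-hoc + scheduled."""
    dashboard = (dashboard_users * views_per_user
                 * refreshes_per_view * queries_per_refresh)
    return dashboard + adhoc_per_day + scheduled_per_day

# Assumed: 10 users, 3 views/day each, 2 refreshes/view, 3 widget queries/refresh,
# plus 15 ad-hoc engineer queries and 5 scheduled jobs per day
print(total_queries_per_day(10, 3, 2, 3, 15, 5))  # 200
```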
Turn it into a monthly estimate
Monthly scanned GB ~= scanned GB/query × queries/day × 30.4
Worked example (planning)
- Ingestion for queried groups: 60 GB/day
- Typical time range: 2 hours -> query_hours/24 ~= 0.083
- Scanned GB/query ~= 60 * 0.083 ~= 5 GB/query
- Queries/day: 200 (dashboards + ad-hoc)
- Monthly scanned ~= 5 * 200 * 30.4 ~= 30,400 GB scanned/month
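The worked example can be reproduced end to end in a few lines, which also makes it easy to swap in your own inputs:

```python
# Inputs from the worked example above
gb_per_day = 60         # ingestion for the queried log groups
query_hours = 2         # typical query time range
queries_per_day = 200   # dashboards + ad-hoc
days_per_month = 30.4   # average days per month

gb_per_query = gb_per_day * query_hours / 24               # 5.0 GB/query
monthly_scanned = gb_per_query * queries_per_day * days_per_month

print(round(gb_per_query, 1), round(monthly_scanned))      # 5.0 30400
```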
Treat this as an estimate and validate with measured scan data as soon as possible.
Incident multiplier (simple planning)
If you have 2 incident days per month where query volume is 10× higher and time ranges are wider, you can add a small “incident add-on” instead of pretending every day is identical.
- Incident add-on ~= (scanned/day during incidents - normal scanned/day) × incident days
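The add-on formula above can be expressed directly. In this sketch a single multiplier folds together both effects the text mentions (more queries and wider time ranges); the example numbers are assumptions:

```python
def incident_addon_gb(normal_scanned_per_day: float,
                      incident_multiplier: float,
                      incident_days_per_month: float) -> float:
    """Extra scanned GB/month from incident days, added on top of the baseline.
    incident_multiplier covers both higher query volume and wider time ranges."""
    incident_scanned = normal_scanned_per_day * incident_multiplier
    return (incident_scanned - normal_scanned_per_day) * incident_days_per_month

# Assumed: 1,000 GB scanned on a normal day, 10x during incidents, 2 incident days
print(round(incident_addon_gb(1000, 10, 2)))  # 18000
```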
Evidence pack before you trust the model
- Volume baseline: measured or estimated GB/day for the queried log groups.
- Time-window evidence: the default range used by dashboards and ad-hoc queries.
- Query-frequency evidence: views, refreshes, scheduled jobs, and human query habits.
- Incident evidence: how much wider and more frequent queries become during outages or investigations.
Common pitfalls
- Using a “last 30 days” default window for routine dashboards.
- Ignoring incident behavior (many repeated broad searches).
- Scanning noisy success logs to answer a question about errors.
- Not separating environments (prod vs staging) when modeling queries.
Validation checklist
- Measure actual scanned GB from a representative week once you have access.
- Validate dashboards: time range, refresh rate, and number of queries executed per view.
- Validate which log groups are included in common queries (scope is the main lever).
Once the scanned-GB model is believable, move to the optimization guide and change one scan driver at a time.
Sources
- CloudWatch pricing: aws.amazon.com/cloudwatch/pricing