Cloud Monitoring metrics pricing (GCP): time series, sample rate, and retention

Metrics systems bill along three axes: "time series × sample frequency × retention". Costs spike when you accidentally create too many unique time series (high cardinality) or when dashboards and alerts repeatedly query wide time windows. A good estimate makes cardinality explicit instead of hoping it stays small.

0) Define what a time series is

A time series is a unique metric name plus a unique combination of dimension/label values. If you add dimensions like pod, container, path, or customerId, the number of unique combinations can explode.

1) Estimate cardinality (time series count)

Model cardinality explicitly. A simple approximation is: series ~= metrics * (dim1_values * dim2_values * ...).

  • Safe dimensions: environment, region, service (bounded sets).
  • Dangerous dimensions: requestId, userId, URL path, pod name (unbounded or high churn).
  • If you need per-entity detail, consider sampling or aggregating before emitting metrics.
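The multiplication above can be sketched in Python. The metric and label counts below are made-up, illustrative numbers; treat the result as an upper bound, since not every label combination actually occurs in practice.

```python
from math import prod

def estimate_series(num_metrics: int, dim_value_counts: list[int]) -> int:
    """Upper-bound time series count: metrics x product of label value counts."""
    return num_metrics * prod(dim_value_counts)

# Hypothetical service: 40 metrics with bounded labels
# environment (3) x region (4) x service (12)
bounded = estimate_series(40, [3, 4, 12])
print(bounded)  # 5760 series

# Adding a pod-name label with ~500 live pods multiplies everything:
with_pods = estimate_series(40, [3, 4, 12, 500])
print(with_pods)  # 2880000 series
```

Note how one high-churn label turns thousands of series into millions; that is the cardinality explosion to guard against.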

Tool: Metrics time series cost calculator.

2) Sample rate (frequency)

Sample rate multiplies ingestion volume: going from a 60s to a 10s interval is a 6× increase in samples. Model both a "normal" and a "high-frequency" scenario, and justify why you need high frequency wherever you use it.
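A minimal sketch of the frequency multiplier, assuming a constant emit rate and a 30-day month (the series count is illustrative):

```python
def samples_per_month(series: int, interval_s: int, days: int = 30) -> int:
    """Ingested samples/month for one emit interval (integer division is fine
    because common intervals divide the month evenly)."""
    return series * (days * 24 * 3600 // interval_s)

series = 5_760
normal = samples_per_month(series, 60)        # 60s interval
high_freq = samples_per_month(series, 10)     # 10s interval
print(high_freq / normal)  # 6.0 -- the 60s -> 10s multiplier
```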

3) Retention

Retention is a storage multiplier. Long retention can be expensive if you store high-resolution data for months. A common pattern is: keep high-res for days, keep downsampled aggregates for weeks/months.
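The two-tier pattern can be sketched as follows; the resolutions and retention windows are illustrative assumptions, not GCP defaults:

```python
def stored_samples(series: int, hi_res_s: int, hi_res_days: int,
                   agg_s: int, agg_days: int) -> int:
    """Samples retained under a two-tier policy:
    high-resolution for a few days, downsampled aggregates afterwards."""
    hi = series * hi_res_days * 24 * 3600 // hi_res_s
    lo = series * agg_days * 24 * 3600 // agg_s
    return hi + lo

# 10s resolution kept 7 days, 5-minute rollups kept 90 days (illustrative)
print(stored_samples(5_760, hi_res_s=10, hi_res_days=7,
                     agg_s=300, agg_days=90))
```

Notice that 90 days of 5-minute rollups stores fewer samples than 7 days of 10s data; downsampling is what makes long retention affordable.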

4) Dashboards and alerts (repeated queries)

Dashboards refreshing frequently and alerts scanning wide windows can create repeated query load. Treat refresh rates and window sizes as explicit drivers.

  • A dashboard refreshing every minute is 1,440 refreshes/day.
  • An alert evaluating every minute with a 24h window repeatedly re-scans the same historical data.
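The query-load drivers above can be made explicit in a small sketch; the dashboard and alert counts are hypothetical:

```python
def daily_query_load(dashboards: list[tuple[int, int]],
                     alerts: list[int]) -> tuple[int, int]:
    """Queries/day from dashboards (panels x refreshes/day) and alert
    evaluations. dashboards: (panel_count, refresh_minutes) pairs;
    alerts: evaluation interval in minutes per alert."""
    dash = sum(panels * (24 * 60 // refresh_min)
               for panels, refresh_min in dashboards)
    alrt = sum(24 * 60 // eval_min for eval_min in alerts)
    return dash, alrt

# 2 dashboards (12 panels @ 1-min refresh, 8 panels @ 5-min refresh),
# 20 alerts each evaluated every minute
dash, alrt = daily_query_load([(12, 1), (8, 5)], [1] * 20)
print(dash, alrt)  # 19584 dashboard queries/day, 28800 alert evaluations/day
```

Panels multiply refreshes, so a busy 1-minute dashboard dominates even before alerts are counted.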

Worked estimate template (copy/paste)

  • Time series = metrics × product(dim value counts)
  • Samples/month = time series × samples/minute × minutes/month
  • Retention = retention days (split high-res vs downsampled if applicable)
  • Query load = dashboards/day + alerts/day (include refresh cadence)
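A worked instance of the template, with illustrative inputs you should replace with your own measurements:

```python
from math import prod

# Illustrative inputs -- substitute measured values.
metrics, dims = 40, [3, 4, 12]        # environment x region x service labels
samples_per_min = 1                    # 60s emit interval
minutes_per_month = 30 * 24 * 60
retention_days = 42                    # split hi-res vs downsampled if needed
dashboard_refreshes = 1_440            # one dashboard refreshing every minute
alert_evals = 20 * 1_440               # 20 alerts evaluated every minute

series = metrics * prod(dims)
samples_month = series * samples_per_min * minutes_per_month
queries_day = dashboard_refreshes + alert_evals

print(f"series={series:,}")
print(f"samples/month={samples_month:,}")
print(f"retention={retention_days} days")
print(f"queries/day={queries_day:,}")
```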

Common pitfalls

  • Unbounded or high-churn dimensions causing cardinality explosions.
  • Using high-frequency sampling everywhere instead of only where it adds value.
  • Keeping long retention for high-resolution data by default.
  • Dashboards/alerts querying wide windows with very frequent refresh.
  • Emitting per-request metrics instead of aggregating.

How to validate

  • List top dimensions and estimate unique value counts (bounded vs unbounded).
  • Validate emit/scrape intervals across environments (dev often differs from prod).
  • Audit dashboards: refresh intervals, time windows, number of panels (queries multiply).
  • Audit alerts: evaluation frequency and window sizes (avoid repeated wide scans).
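One way to audit dimension cardinality offline is to count unique values per label over a sample of observed series. The label sets below are made up; a label whose count keeps growing with traffic is the unbounded one to cap.

```python
from collections import defaultdict

def label_cardinality(samples: list[dict[str, str]]) -> dict[str, int]:
    """Count unique values per label across observed series."""
    values: defaultdict[str, set] = defaultdict(set)
    for labels in samples:
        for key, value in labels.items():
            values[key].add(value)
    return {key: len(vals) for key, vals in values.items()}

observed = [
    {"env": "prod", "region": "us-east1", "pod": "api-7f9c"},
    {"env": "prod", "region": "us-east1", "pod": "api-2b1d"},
    {"env": "dev",  "region": "us-east1", "pod": "api-9e3a"},
]
print(label_cardinality(observed))  # {'env': 2, 'region': 1, 'pod': 3}
```

Here "pod" already has as many values as there are samples, the signature of a high-churn label.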

Related guides

Cloud Logging pricing (GCP): ingestion, retention, and query scans
A practical model for Cloud Logging costs: GB ingested, retention storage (GB-month), and query/scan behavior. Includes a fast method to estimate GB/day from events/sec × bytes/event and a checklist to find dominant sources.
Metrics and monitoring costs explained: series, cardinality, and retention
A practical framework to estimate metrics bills: number of unique time series (cardinality), retention, dashboards/alerts, and the fastest levers to reduce cost safely.
Cloud Spanner cost estimation: capacity, storage, backups, and multi-region traffic
Estimate Spanner cost using measurable drivers: provisioned capacity (baseline + peak), stored GB-month (data + indexes), backups/retention, and multi-region/network patterns. Includes a worked template, common pitfalls, and validation steps.
Cloud SQL pricing: instance-hours, storage, backups, and network (practical estimate)
A driver-based Cloud SQL estimate: instance-hours (HA + replicas), storage GB-month, backups/retention, and data transfer. Includes a worked template, common pitfalls, and validation steps for peak sizing and growth.
Artifact Registry pricing (GCP): storage + downloads + egress (practical estimate)
A practical Artifact Registry cost model: stored GB-month baseline, download volume from CI/CD and cluster churn, and outbound transfer. Includes a workflow to estimate GB-month from retention and validate layer sharing and peak pull storms.
Azure Monitor metrics pricing: estimate custom metrics, retention, and API calls
A practical metrics cost model: time series count, sample rate, retention, and dashboard/alert query behavior. Includes validation steps to avoid high-cardinality mistakes.

FAQ

What usually drives metrics costs?
Cardinality (number of time series) is the biggest risk. Sample rate and retention multiply volume, and dashboards/alerts can create repeated query load.
How do I estimate quickly?
Estimate time series count, samples per minute, and retention days. Then validate your top dimensions (labels) and cap any unbounded dimensions.
What is the most common mistake?
Adding an unbounded label like customerId, requestId, pod name, or path and accidentally creating a time series explosion.
How do I validate?
Validate dimension cardinality, validate emit/scrape intervals, and audit dashboard refresh and alert evaluation frequency.

Last updated: 2026-01-27