SNS cost optimization (reduce deliveries and retries)

SNS costs are usually delivery-driven. The fastest way to save is to reduce deliveries (fan-out) and reduce retries (failures and timeouts). This playbook focuses on changes that reduce spend without breaking message semantics.

SNS cost levers

  • Reduce fan-out: tighten subscriptions and filters.
  • Retry control: fix failing endpoints to cut retries.
  • Message size: smaller payloads reduce data transfer and, for large messages, the number of billed requests.
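To make the message-size lever concrete, here is a minimal sketch of how payload size can translate into billed requests. It assumes the 64 KB billing increment that AWS documents for SNS publishes at the time of writing; verify that figure against current pricing. The function name billed_requests is hypothetical.

```python
import math

# Assumed billing increment (64 KB per billed request); verify against current SNS pricing.
CHUNK_BYTES = 64 * 1024


def billed_requests(payload_bytes: int) -> int:
    """Estimate billed requests for one publish of the given payload size."""
    if payload_bytes < 0:
        raise ValueError("payload_bytes must be non-negative")
    # Even a tiny message counts as one request; larger ones round up per chunk.
    return max(1, math.ceil(payload_bytes / CHUNK_BYTES))


if __name__ == "__main__":
    for size in (1_024, 64 * 1024, 64 * 1024 + 1, 256 * 1024):
        print(size, billed_requests(size))
```

Under this assumption, trimming a payload only changes the request count when it crosses a chunk boundary, so payloads that sit just above a boundary are the ones worth shrinking first.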

Step 0: baseline the two volumes (publishes vs deliveries)

  • Publishes/day by topic
  • Deliveries/day by topic and protocol
  • Failure rate and retry behavior (incident windows matter)
  • Average matched fan-out after filter policies
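The baseline above can be reduced to two ratios per topic: average matched fan-out and failure rate. The sketch below assumes you already have daily totals, for example from CloudWatch SNS metrics such as NumberOfMessagesPublished, NumberOfNotificationsDelivered, and NumberOfNotificationsFailed. The TopicDay type and its field names are hypothetical, and how failures are counted can differ by protocol, so treat the failure rate as approximate.

```python
from dataclasses import dataclass


@dataclass
class TopicDay:
    """Hypothetical daily totals for one topic."""
    published: int   # messages published to the topic
    delivered: int   # successful deliveries across all subscriptions
    failed: int      # failed deliveries (counting rules vary by protocol)


def baseline(day: TopicDay) -> dict:
    """Compute average matched fan-out and failure rate for one day."""
    attempts = day.delivered + day.failed
    fan_out = day.delivered / day.published if day.published else 0.0
    failure_rate = day.failed / attempts if attempts else 0.0
    return {"fan_out": fan_out, "failure_rate": failure_rate}


if __name__ == "__main__":
    print(baseline(TopicDay(published=1_000, delivered=4_500, failed=500)))
```

Running this per topic and per protocol, including at least one incident window, gives the "before" numbers that later steps are measured against.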

If you need a quick estimate first, see "Estimate SNS deliveries per month" under Related guides.

1) Reduce fan-out (deliveries) without losing routing clarity

  • Use filter policies: deliver only relevant messages to each subscriber.
  • Split topics by intent: narrow high-fan-out topics so subscribers don’t receive irrelevant events.
  • Avoid broadcast payloads: send a pointer to an object instead of large payloads to many subscribers.
  • Remove dead subscriptions: stale endpoints create failures and retries.
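Before rolling out a filter policy, it helps to estimate how much traffic it would actually match. The sketch below is a simplified local matcher that covers only exact string matching on message attributes (every policy key must match, and any listed value is accepted). Real SNS filter policies support more operators, so this is an approximation, and the attribute names and sample messages are made up for illustration.

```python
def matches(policy: dict, attributes: dict) -> bool:
    """True if every policy key is present and its value is one of the allowed values."""
    return all(attributes.get(key) in allowed for key, allowed in policy.items())


def match_rate(policy: dict, sample: list) -> float:
    """Fraction of sampled messages that this subscription would receive."""
    if not sample:
        return 0.0
    return sum(matches(policy, attrs) for attrs in sample) / len(sample)


if __name__ == "__main__":
    # Hypothetical policy and sample of message attributes.
    policy = {"event_type": ["order_created", "order_cancelled"]}
    sample = [
        {"event_type": "order_created"},
        {"event_type": "order_shipped"},
        {"event_type": "order_cancelled"},
        {"region": "eu"},
    ]
    print(match_rate(policy, sample))
```

A match rate close to 1.0 on a representative sample suggests the policy will not reduce fan-out much, which is worth knowing before touching production subscriptions.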

2) Reduce retries by fixing endpoints (retries are a cost + reliability incident)

  • Fix slow/erroring HTTP endpoints that cause repeated delivery attempts.
  • Use dead-letter queues for failed deliveries where appropriate.
  • Set timeouts and backoff intentionally; cap the number of retries so a failing endpoint can’t keep generating delivery attempts.
  • Alert on increased failure rate early; it’s cheaper to fix the endpoint than to pay for retries.
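To see why failing endpoints inflate delivery volume, a rough retry multiplier can be modeled as below. This is a simplified sketch that assumes each attempt fails independently with the same probability and that retries stop after a fixed cap. Real incidents are usually correlated, so treat the result as a lower bound during outages. The function name expected_attempts is hypothetical.

```python
def expected_attempts(failure_prob: float, max_retries: int) -> float:
    """Expected delivery attempts per message under independent failures.

    Attempt k+1 happens only if the first k attempts all failed,
    so the expectation is the sum of failure_prob**k for k = 0..max_retries.
    """
    if not 0.0 <= failure_prob <= 1.0:
        raise ValueError("failure_prob must be between 0 and 1")
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    return sum(failure_prob ** k for k in range(max_retries + 1))


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.5, 1.0):
        print(p, expected_attempts(p, max_retries=3))
```

Multiplying baseline deliveries by this factor gives a rough attempt count; with a fully failing endpoint the multiplier reaches the retry cap plus one, which is why stale or broken subscriptions are costly even on low-volume topics.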

3) Control alert storms

  • Dedupe: one incident should produce a bounded number of notifications.
  • Rate-limit: cap notifications per key (service, customer, incident id).
  • Group: prefer “digest per minute” for noisy signals instead of per-event alerts.
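The dedupe and rate-limit ideas above can be sketched as a small in-memory gate placed in front of the publish call. This is a minimal single-process illustration, not a production design: the AlertGate class, its parameters, and the notion of a "fingerprint" for identifying duplicate alerts are all assumptions introduced here.

```python
import time
from collections import defaultdict, deque


class AlertGate:
    """Suppress duplicate alerts and cap notifications per key in a sliding window."""

    def __init__(self, max_per_window: int, window_seconds: float, dedupe_seconds: float):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.dedupe_seconds = dedupe_seconds
        self._sent = defaultdict(deque)   # key -> timestamps of allowed notifications
        self._last_seen = {}              # (key, fingerprint) -> time last allowed

    def allow(self, key: str, fingerprint: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # Dedupe: drop an identical alert seen recently for the same key.
        last = self._last_seen.get((key, fingerprint))
        if last is not None and now - last < self.dedupe_seconds:
            return False
        # Rate limit: forget timestamps that fell out of the window, then check the cap.
        recent = self._sent[key]
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_per_window:
            return False
        recent.append(now)
        self._last_seen[(key, fingerprint)] = now
        return True


if __name__ == "__main__":
    gate = AlertGate(max_per_window=2, window_seconds=60, dedupe_seconds=30)
    for t, fp in [(0, "disk-full"), (1, "disk-full"), (2, "cpu-high"), (3, "oom")]:
        print(t, fp, gate.allow("service-a", fp, now=t))
```

Only calls that return True would go on to publish. A real deployment would also need shared state across publishers and eviction of old fingerprints, both of which this sketch leaves out.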

4) Validate that savings didn’t move elsewhere

Optimizing SNS can shift load to downstream systems (SQS consumers, HTTP endpoints). Validate the end-to-end cost:

  • SQS request volume (if SNS fans out to SQS)
  • Downstream API costs and logging volume (retries often multiply both)
  • Error rates and latency (don’t “save” by dropping messages silently)

Quantify before/after

  • Deliveries saved = deliveries/day before − deliveries/day after
  • Retry reduction = failed deliveries before − after
  • Fan-out reduction = matched subscribers per publish before − after
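The three differences above can be computed from the same daily totals used for the baseline. The sketch below adds an optional cost estimate; the dictionary keys and the price_per_delivery parameter are hypothetical, and the price should come from your own bill or the current pricing for your protocol mix.

```python
def daily_savings(before: dict, after: dict, price_per_delivery: float = 0.0) -> dict:
    """Compare two daily snapshots with keys: deliveries, failed, fan_out."""
    deliveries_saved = before["deliveries"] - after["deliveries"]
    return {
        "deliveries_saved": deliveries_saved,
        "retry_reduction": before["failed"] - after["failed"],
        "fan_out_reduction": before["fan_out"] - after["fan_out"],
        # Rough estimate only; uses a caller-supplied, assumed unit price.
        "estimated_cost_saved": deliveries_saved * price_per_delivery,
    }


if __name__ == "__main__":
    before = {"deliveries": 5_000_000, "failed": 200_000, "fan_out": 5}
    after = {"deliveries": 3_000_000, "failed": 20_000, "fan_out": 3}
    print(daily_savings(before, after))
```

Comparing snapshots taken over equivalent windows (same weekday mix, no incident in only one of them) keeps the result from being skewed by normal traffic variation.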

Tool: SNS cost calculator

Common pitfalls

  • Reducing publishes but leaving fan-out unchanged (deliveries stay high).
  • Adding filter policies but not validating match rate (no real fan-out reduction).
  • Ignoring failures; retries can dominate “quiet” topics during incidents.
  • Building alert systems without dedupe/rate limits (inevitable storm later).
  • Not removing stale subscriptions (they create ongoing failure/retry cost).

Related guides

  • Estimate SNS deliveries per month (messages x subscribers): a practical workflow to estimate SNS delivery volume: start from publishes, model matched fan-out (after filter policies), add a retry multiplier for failures, and validate with metrics so budgets survive incidents.
  • SNS pricing: what to model (publishes, deliveries, fan-out): a practical SNS pricing checklist: publish requests, delivery requests by protocol mix, fan-out after filter policies, and the retry/alert-storm patterns that create surprise delivery volume.
  • AWS SQS cost optimization (high-leverage fixes): a practical playbook to reduce SQS costs: reduce requests per successful message with batching and long polling, prevent retry storms and poison loops, and validate savings with sent/received/deleted metrics.
  • Estimate SQS requests (from messages and retries): a practical workflow to estimate billable SQS request volume: start from messages/month, model requests per successful message (Send/Receive/Delete), and add the multipliers (retries, empty receives, poison loops) that cause spikes.
  • SES cost optimization (reduce volume, retries, and payload): a practical playbook to reduce AWS SES costs: prevent duplicate sends, control retries and alert storms, reduce non-prod waste, and keep payloads small when they matter, with validation steps to protect deliverability.
  • SQS vs SNS cost: how to compare messaging unit economics: compare SQS vs SNS cost with a practical checklist: request types, retries, fan-out, payload transfer, and the usage patterns that decide the bill.

FAQ

What's the biggest lever for SNS cost?
Reduce deliveries: lower fan-out, use filter policies, and avoid broadcasting to many endpoints when only a subset needs the message.

How do I prevent incident-driven SNS cost spikes?
Control alert storms (dedupe, rate limit) and fix delivery failures that trigger retries. Treat high retry volume as both a reliability and cost incident.

What commonly increases deliveries?
High fan-out topics, duplicated notifications, and retries to failing endpoints. A small failure rate at high volume can create a large delivery count.

How do I validate optimizations?
Track publishes, deliveries, failure rate, and retry volume before/after. Validate that filter policies and topic splits don’t break required consumers.

Last updated: 2026-02-07