SNS cost optimization (reduce deliveries and retries)
SNS costs are usually delivery-driven. The fastest way to save is to reduce deliveries (fan-out) and reduce retries (failures and timeouts). This playbook focuses on changes that reduce spend without breaking message semantics.
SNS cost levers
- Reduce fan-out: tighten subscriptions and filters.
- Retry control: fix failing endpoints to cut retries.
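- Message size: smaller payloads reduce transfer, and because SNS bills each 64 KB chunk of a published payload as one request, large messages also multiply request counts.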
Step 0: baseline the two volumes (publishes vs deliveries)
- Publishes/day by topic
- Deliveries/day by topic and protocol
- Failure rate and retry behavior (incident windows matter)
- Average matched fan-out after filter policies
If you need a quick estimate first, see "Estimate SNS deliveries per month" under Related guides.
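The baseline numbers above combine into a simple estimate. The following is a minimal sketch, assuming deliveries scale as publishes times matched fan-out times a retry multiplier; the function and parameter names are illustrative and not part of any AWS API.

```python
def estimate_deliveries(publishes_per_day, matched_fanout, retry_multiplier=1.0):
    """Estimated delivery attempts per day for one topic.

    matched_fanout: average subscribers that receive each publish after
        filter policies are applied.
    retry_multiplier: assumed average attempts per delivery (1.0 = no retries).
    """
    return publishes_per_day * matched_fanout * retry_multiplier


def observed_fanout(deliveries_per_day, publishes_per_day):
    """Average deliveries per publish, derived from the two baseline volumes."""
    if publishes_per_day == 0:
        return 0.0
    return deliveries_per_day / publishes_per_day
```

Comparing observed_fanout against the subscriber count on a topic gives a rough indication of how much the filter policies are actually narrowing delivery.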
1) Reduce fan-out (deliveries) without losing routing clarity
- Use filter policies: deliver only relevant messages to each subscriber.
- Split topics by intent: narrow high-fan-out topics so subscribers don’t receive irrelevant events.
- Avoid broadcast payloads: send a pointer to an object instead of large payloads to many subscribers.
- Remove dead subscriptions: stale endpoints create failures and retries.
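Before relying on a filter policy, it helps to check how many messages it would actually match. The sketch below is a simplified model that handles only exact string matching on message attributes, which is a subset of what SNS filter policies support; the attribute names and values are hypothetical.

```python
def matches(policy, attributes):
    """True if every policy key is present and its value is in the allowed list.

    Simplified model: exact string matching only. Real SNS filter policies
    also support prefix, numeric, and other operators not modeled here.
    """
    return all(attributes.get(key) in allowed for key, allowed in policy.items())


def match_rate(policy, sample):
    """Fraction of sampled messages (attribute dicts) the policy would deliver."""
    if not sample:
        return 0.0
    return sum(matches(policy, attrs) for attrs in sample) / len(sample)


# Hypothetical subscriber policy and a sample of recent message attributes.
policy = {"event_type": ["order_created"]}
sample = [
    {"event_type": "order_created"},
    {"event_type": "order_shipped"},
    {"event_type": "order_cancelled"},
    {"source": "billing"},  # attribute missing, so it does not match
]
rate = match_rate(policy, sample)
```

A match rate close to 1.0 on a representative sample suggests the policy is not reducing fan-out for that subscriber.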
2) Reduce retries by fixing endpoints (retries are a cost + reliability incident)
- Fix slow/erroring HTTP endpoints that cause repeated delivery attempts.
- Use dead-letter queues for failed deliveries where appropriate.
- Set endpoint timeouts and the delivery retry policy (retry count and backoff) intentionally, so a failing endpoint does not absorb the maximum number of attempts on every message.
- Alert on increased failure rate early; it’s cheaper to fix the endpoint than to pay for retries.
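To see why a failing endpoint inflates volume, the sketch below estimates the expected number of delivery attempts per message. It assumes each attempt fails independently with the same probability and that retries stop after a fixed limit; real failures tend to be correlated during incidents, so treat the result as a rough figure. The function name and parameters are illustrative.

```python
def expected_attempts(failure_probability, max_retries):
    """Expected delivery attempts per message.

    Attempt k (0-based) only happens if the previous k attempts all failed,
    so the expectation is the sum of p**k for k = 0..max_retries.
    Assumes independent attempts with a constant failure probability.
    """
    p = failure_probability
    return sum(p ** k for k in range(max_retries + 1))
```

Under these assumptions, a healthy endpoint (failure probability 0) costs one attempt per message, while an endpoint that is fully down costs max_retries + 1 attempts for every message routed to it.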
3) Control alert storms
- Dedupe: one incident should produce a bounded number of notifications.
- Rate-limit: cap notifications per key (service, customer, incident id).
- Group: prefer “digest per minute” for noisy signals instead of per-event alerts.
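One way to bound notifications per key is a small gate placed in front of the publish call. This is a minimal in-memory sketch, assuming a single process and a sliding time window; the class name and its parameters are hypothetical, and a production version would need shared state across publishers.

```python
class NotificationGate:
    """Allows at most max_per_window notifications per key per time window."""

    def __init__(self, max_per_window, window_seconds):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._sent = {}  # key -> timestamps of notifications still in the window

    def allow(self, key, now):
        """Return True if a notification for this key may be sent at time now."""
        recent = [t for t in self._sent.get(key, []) if now - t < self.window_seconds]
        if len(recent) >= self.max_per_window:
            self._sent[key] = recent
            return False
        recent.append(now)
        self._sent[key] = recent
        return True


# Example: at most 2 notifications per incident id per 60 seconds.
gate = NotificationGate(max_per_window=2, window_seconds=60)
```

The caller checks gate.allow(incident_id, timestamp) and publishes only when it returns True; suppressed events can be counted and folded into a later digest.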
4) Validate that savings didn’t move elsewhere
Optimizing SNS can shift load to downstream systems (SQS consumers, HTTP endpoints). Validate the end-to-end cost:
- SQS request volume (if SNS fans out to SQS)
- Downstream API costs and logging volume (retries often multiply both)
- Error rates and latency (don’t “save” by dropping messages silently)
Quantify before/after
- Deliveries saved = deliveries/day before − deliveries/day after
- Retry reduction = failed deliveries/day before − failed deliveries/day after
- Fan-out reduction = matched subscribers per publish before − matched subscribers per publish after
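The before/after differences above can be computed in one pass. The following is a small sketch, assuming both snapshots are dictionaries with the same metric names; the metric names and the numbers in the example are illustrative, not measured data.

```python
def compare(before, after):
    """Return {metric: (absolute_saving, percent_saving)} for each metric."""
    report = {}
    for metric, old in before.items():
        new = after[metric]
        saved = old - new
        pct = (saved * 100.0 / old) if old else 0.0
        report[metric] = (saved, pct)
    return report


# Illustrative numbers only.
before = {"deliveries_per_day": 1000, "failed_deliveries_per_day": 50}
after = {"deliveries_per_day": 600, "failed_deliveries_per_day": 10}
report = compare(before, after)
```

Use the same time window and the same topics for both snapshots, and exclude incident days from one side only if they are excluded from the other.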
Tool: SNS cost calculator
Common pitfalls
- Reducing publishes but leaving fan-out unchanged (deliveries stay high).
- Adding filter policies but not validating match rate (no real fan-out reduction).
- Ignoring failures; retries can dominate “quiet” topics during incidents.
- Building alert systems without dedupe/rate limits (inevitable storm later).
- Not removing stale subscriptions (they create ongoing failure/retry cost).
Related guides
Estimate SNS deliveries per month (messages x subscribers)
A practical workflow to estimate SNS delivery volume: start from publishes, model matched fan-out (after filter policies), add a retry multiplier for failures, and validate with metrics so budgets survive incidents.
SNS pricing: what to model (publishes, deliveries, fan-out)
A practical SNS pricing checklist: publish requests, delivery requests by protocol mix, fan-out after filter policies, and the retry/alert-storm patterns that create surprise delivery volume.
AWS SQS cost optimization (high-leverage fixes)
A practical playbook to reduce SQS costs: reduce requests per successful message with batching and long polling, prevent retry storms and poison loops, and validate savings with sent/received/deleted metrics.
Estimate SQS requests (from messages and retries)
A practical workflow to estimate billable SQS request volume: start from messages/month, model requests per successful message (Send/Receive/Delete), and add the multipliers (retries, empty receives, poison loops) that cause spikes.
SES cost optimization (reduce volume, retries, and payload)
A practical playbook to reduce AWS SES costs: prevent duplicate sends, control retries and alert storms, reduce non-prod waste, and keep payloads small when they matter — with validation steps to protect deliverability.
SQS vs SNS cost: how to compare messaging unit economics
Compare SQS vs SNS cost with a practical checklist: request types, retries, fan-out, payload transfer, and the usage patterns that decide the bill.
FAQ
What's the biggest lever for SNS cost?
Reduce deliveries: lower fan-out, use filter policies, and avoid broadcasting to many endpoints when only a subset needs the message.
How do I prevent incident-driven SNS cost spikes?
Control alert storms (dedupe, rate limit) and fix delivery failures that trigger retries. Treat high retry volume as both a reliability and cost incident.
What commonly increases deliveries?
High fan-out topics, duplicated notifications, and retries to failing endpoints. A small failure rate at high volume can create a large delivery count.
How do I validate optimizations?
Track publishes, deliveries, failure rate, and retry volume before/after. Validate that filter policies and topic splits don’t break required consumers.
Last updated: 2026-02-07