Design a Rate Limiter

System Design | Must

Problem Statement

Design a scalable notification delivery platform that sends messages across multiple channels -- push notifications, SMS, email, and webhooks -- to millions of users worldwide. The system must handle billions of notifications daily, support delivery prioritization, ensure idempotency, and provide tracking for delivery status and user engagement metrics.

Consider the scale of platforms like Stripe, Twilio, or Amazon SNS that route billions of time-sensitive events. The system must gracefully handle third-party channel failures, respect user preferences and quiet hours, deduplicate redundant messages, and provide near real-time delivery status updates to sender applications. Interviewers expect you to reason about message queuing, delivery guarantees, fan-out patterns, channel-specific retry logic, and the observability needed to diagnose delivery failures across a distributed fleet.

Key Requirements

Functional

Multi-channel delivery -- System must route notifications to push, SMS, email, and webhook endpoints based on user preferences and message type
Delivery tracking -- Provide real-time status updates (sent, delivered, failed, opened, clicked) and aggregate engagement analytics for senders
User preference management -- Honor opt-out settings, channel preferences, quiet hours, and notification frequency caps per user
Template and personalization -- Support message templates with variable substitution, localization, and dynamic content rendering
Priority and scheduling -- Allow urgent messages to bypass queues while supporting scheduled delivery and batching for non-critical notifications

Non-Functional

Scalability -- Handle 10 billion notifications per day with peaks of 500,000 messages per second during high-traffic events
Reliability -- Achieve 99.95% delivery success rate with automatic retries, circuit breaking for failing channels, and fallback channel routing
Latency -- Hand off high-priority notifications to the delivery channel within 2 seconds at P95 and within 5 seconds at P99, measured from API submission
Consistency -- Guarantee at-least-once delivery semantics with idempotency keys to prevent duplicate sends; eventually consistent delivery status tracking
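The scale and reliability targets above translate into concrete numbers worth stating early in an interview. The following is a back-of-the-envelope sketch that uses only the figures from the requirements; the variable names are illustrative.

```python
# Back-of-the-envelope sizing from the stated non-functional requirements.
DAILY_NOTIFICATIONS = 10_000_000_000  # 10 billion notifications per day
PEAK_PER_SECOND = 500_000             # stated peak rate
SECONDS_PER_DAY = 24 * 60 * 60        # 86,400

# Average sustained rate the pipeline must absorb.
average_per_second = DAILY_NOTIFICATIONS / SECONDS_PER_DAY

# How much headroom over the average the peak demands.
peak_to_average = PEAK_PER_SECOND / average_per_second

# A 99.95% success target leaves a 0.05% failure budget (5 in 10,000).
allowed_failures_per_day = DAILY_NOTIFICATIONS * 5 // 10_000

print(f"average: {average_per_second:,.0f} msg/s")          # average: 115,741 msg/s
print(f"peak/average: {peak_to_average:.2f}x")               # peak/average: 4.32x
print(f"failure budget: {allowed_failures_per_day:,}/day")   # failure budget: 5,000,000/day
```

The peak is roughly four times the average, which suggests queues should be sized to buffer bursts rather than provisioning every downstream worker pool for the peak rate.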

What Interviewers Focus On

Based on real interview experiences, these are the areas interviewers probe most deeply:

1. Message Queuing and Fan-Out Architecture

Notifications must be distributed across workers efficiently while preventing head-of-line blocking when specific channels slow down or fail. Interviewers want to see how you partition work, isolate failures, and maintain ordering guarantees where needed.

Hints to consider:

Use dedicated queues per channel type to isolate slow SMS providers from fast push notifications
Implement priority queues with separate consumer pools to ensure urgent alerts bypass bulk marketing messages
Consider partitioning by user ID or tenant ID to maintain per-user ordering while enabling horizontal scaling
Retry transient failures with exponential backoff, and route messages that exhaust their retries to dead letter queues with manual replay capabilities
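The hints above can be sketched as three small routing decisions: which queue a message goes to, which partition within that queue, and what happens after a failed attempt. This is a minimal sketch under stated assumptions, not a prescribed design. The queue naming scheme, the partition count, the retry limit, and the function names are all hypothetical, and jitter is omitted from the backoff to keep the output deterministic.

```python
import hashlib

NUM_PARTITIONS = 64  # hypothetical partition count per queue
MAX_ATTEMPTS = 5     # hypothetical retry budget before dead-lettering


def queue_name(channel: str, priority: str) -> str:
    """One queue per (channel, priority) so a slow channel cannot block others."""
    return f"notifications.{channel}.{priority}"


def partition_for(user_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable hash of the user ID, so one user's messages stay in one partition."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def backoff_delay(retry_index: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Exponential backoff in seconds: base, 2*base, 4*base, ... capped at `cap`."""
    return min(cap, base * (2 ** retry_index))


def next_step(failed_attempts: int):
    """Decide what to do after `failed_attempts` failures of one message."""
    if failed_attempts >= MAX_ATTEMPTS:
        return ("dead_letter", None)
    return ("retry", backoff_delay(failed_attempts - 1))


print(queue_name("sms", "high"))  # notifications.sms.high
print(next_step(1))               # ('retry', 1.0)
print(next_step(5))               # ('dead_letter', None)
```

Hashing the user ID rather than using Python's built-in `hash` keeps the partition assignment stable across processes, which per-user ordering depends on.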

2. Idempotency and Exactly-Once Processing

Sender applications may retry API calls after timeouts, causing duplicate notification requests. Sending the same alert twice damages user trust and wastes money on SMS or third-party API calls. You must prevent duplicates across client retries and worker crashes.

Hints to consider:

Accept client-provided idempotency keys and store them with TTL in Redis or DynamoDB to detect duplicates within a time window
Make notification IDs deterministic based on content hash plus recipient to catch duplicates from different sources
Use database constraints or conditional writes to ensure a notification record is created exactly once per idempotency key
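Handle the race where two workers process the same message by atomically claiming the notification (for example, with a conditional status update) before sending to external channels; a plain read of the delivery status followed by a send is itself racy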
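The core of these hints is an atomic "claim if absent" operation on the idempotency key. The sketch below uses an in-memory dictionary guarded by a lock as a stand-in for a shared store. In a real deployment the same role would typically be played by something like a Redis `SET` with the `NX` and `EX` options, or a DynamoDB conditional write. The class and function names here are hypothetical, and the clock is injectable only so the example is deterministic.

```python
import hashlib
import threading
import time


class IdempotencyStore:
    """In-memory stand-in for a shared key store with per-key TTL."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expiry = {}  # key -> time at which the claim expires
        self._lock = threading.Lock()

    def claim(self, key: str) -> bool:
        """Return True only for the first caller within the TTL window."""
        now = self._clock()
        with self._lock:  # check and write happen as one atomic step
            expires_at = self._expiry.get(key)
            if expires_at is not None and expires_at > now:
                return False  # duplicate within the window
            self._expiry[key] = now + self._ttl
            return True


def derived_key(recipient_id: str, content: str) -> str:
    """Deterministic ID from recipient plus content, for senders without a key."""
    h = hashlib.sha256()
    h.update(recipient_id.encode("utf-8"))
    h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    h.update(content.encode("utf-8"))
    return h.hexdigest()


# Usage with a fake clock so the TTL behaviour is visible.
fake_now = [0.0]
store = IdempotencyStore(ttl_seconds=60, clock=lambda: fake_now[0])
first = store.claim("req-123")   # True: first time this key is seen
second = store.claim("req-123")  # False: retry inside the window
fake_now[0] = 61.0
third = store.claim("req-123")   # True: the earlier claim has expired
print(first, second, third)      # True False True
```

Only the worker whose `claim` returns True proceeds to call the external channel, which is what closes the two-worker race described above.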

3. Third-Party Channel Integration and Failure Handling

External SMS, email, and push providers have varying SLAs, rate limits, and failure modes. A single provider outage should not block all deliveries. Interviewers want resilience patterns and observability into vendor health.

Hints to consider:

Implement circuit breakers per vendor to fail fast and route traffic to backup providers when error rates spike
Use per-vendor rate limiters with token buckets to respect API quotas and avoid 429 errors that trigger penalties
Design a webhook retry system with exponential backoff for customer endpoints, capping retries to avoid infinite loops
Track per-channel success rates and latency metrics to automatically deprioritize slow or failing providers

4. User Preferences and Compliance

Users must be able to opt out of categories, mute notifications during sleep hours, and have their data deleted. Regulations such as GDPR and the TCPA impose penalties for non-compliance. The system must enforce these rules consistently at high throughput.

Hints to consider:

Cache user preferences in memory or Redis with short TTLs to avoid database lookups on every notification check
Evaluate quiet hours and frequency caps before enqueuing to reduce wasted processing on messages that will be dropped
Store opt-out lists in low-latency storage and propagate updates within seconds to all notification workers
Design a preference service API that workers query in batch to amortize lookup costs across large fan-out operations
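The pre-enqueue checks described above can be expressed as one pure function over a cached preference record. The sketch below is a simplified illustration. The `Preferences` fields, the reason strings, and the fixed UTC offset are assumptions made for brevity; a real system would more likely store an IANA time zone per user and read the recent send count from a shared counter.

```python
from dataclasses import dataclass, field
from datetime import datetime, time, timedelta, timezone


@dataclass
class Preferences:
    """Hypothetical cached preference record for one user."""
    opted_out_categories: set = field(default_factory=set)
    quiet_start: time = time(22, 0)  # local time quiet hours begin
    quiet_end: time = time(7, 0)     # local time quiet hours end
    utc_offset_hours: int = 0        # simplification: fixed offset, no DST
    max_per_hour: int = 5            # frequency cap


def in_quiet_hours(prefs: Preferences, now_utc: datetime) -> bool:
    local = (now_utc + timedelta(hours=prefs.utc_offset_hours)).time()
    start, end = prefs.quiet_start, prefs.quiet_end
    if start <= end:
        return start <= local < end
    # Window wraps past midnight, e.g. 22:00 to 07:00.
    return local >= start or local < end


def should_enqueue(prefs: Preferences, category: str, now_utc: datetime,
                   sent_last_hour: int):
    """Return (allowed, reason); checks run from cheapest to most specific."""
    if category in prefs.opted_out_categories:
        return (False, "opted_out")
    if in_quiet_hours(prefs, now_utc):
        return (False, "quiet_hours")
    if sent_last_hour >= prefs.max_per_hour:
        return (False, "frequency_cap")
    return (True, "ok")


prefs = Preferences(opted_out_categories={"marketing"}, utc_offset_hours=-5)
late = datetime(2024, 1, 15, 4, 30, tzinfo=timezone.utc)  # 23:30 local
noon = datetime(2024, 1, 15, 17, 0, tzinfo=timezone.utc)  # 12:00 local

print(should_enqueue(prefs, "marketing", noon, 0))  # (False, 'opted_out')
print(should_enqueue(prefs, "security", late, 0))   # (False, 'quiet_hours')
print(should_enqueue(prefs, "security", noon, 5))   # (False, 'frequency_cap')
print(should_enqueue(prefs, "security", noon, 0))   # (True, 'ok')
```

Returning a reason alongside the decision makes it straightforward to record why a notification was suppressed, which supports the delivery tracking requirement.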
