Design a Pub/Sub System
Problem Statement
Design a publish-subscribe messaging system where publishers send messages to named topics and subscribers receive messages from topics they have subscribed to. The system must support many-to-many relationships between publishers and subscribers, deliver messages reliably, and scale horizontally as topic counts and message volumes grow. Think of systems like Apache Kafka, Google Cloud Pub/Sub, or AWS SNS/SQS that power event-driven architectures across the industry.
The core challenge is building a messaging backbone that delivers messages with low latency and high throughput while providing configurable delivery guarantees. Publishers should not need to know who the subscribers are, and subscribers should not need to coordinate with each other. The system must handle subscriber failures gracefully (retrying delivery or buffering messages), support multiple delivery semantics (at-least-once, at-most-once), and partition topics for horizontal scaling. At Roblox scale, such a system could power game event broadcasting, player activity feeds, real-time analytics pipelines, and inter-service communication.
Interviewers use this question to evaluate your understanding of distributed messaging fundamentals: message ordering guarantees, consumer group coordination, backpressure handling, and the trade-offs between throughput, latency, and durability.
Key Requirements
Functional
- Topic management -- create, delete, and list topics; topics are named channels that decouple publishers from subscribers
- Publish messages -- publishers send messages to a topic; messages include a payload and optional metadata (headers, partition key)
- Subscribe and consume -- subscribers register interest in topics and receive messages either via push (webhook/callback) or pull (polling) mechanisms
- Consumer groups -- multiple subscribers can form a group where each message is delivered to exactly one member of the group, enabling parallel processing
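The functional surface above can be sketched as a tiny in-memory broker. This is an illustrative sketch only (the `Broker` class and its method names are assumptions, not part of any real system's API); a production broker would persist messages and serve them over the network:

```python
from collections import deque

class Broker:
    """Hypothetical in-memory broker illustrating the core API surface:
    topic management, keyed publish, and pull-based consumption."""

    def __init__(self):
        self.topics = {}  # topic name -> list of partition deques

    def create_topic(self, name, partitions=1):
        self.topics[name] = [deque() for _ in range(partitions)]

    def delete_topic(self, name):
        del self.topics[name]

    def publish(self, topic, payload, key=None):
        # Route by hash of the partition key; default to partition 0 if no key.
        parts = self.topics[topic]
        idx = hash(key) % len(parts) if key is not None else 0
        parts[idx].append(payload)
        return idx  # partition the message landed on

    def pull(self, topic, partition, max_messages=10):
        # Pull model: the consumer controls when and how much it fetches.
        part = self.topics[topic][partition]
        return [part.popleft() for _ in range(min(max_messages, len(part)))]

broker = Broker()
broker.create_topic("player-events", partitions=4)
broker.publish("player-events", {"event": "join"}, key="player-42")
```

Note how publishers never see subscribers: they only name a topic, which is the decoupling the requirements call for.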
Non-Functional
- Scalability -- handle 1 million messages per second across 100,000 topics with 500,000 active subscriptions
- Latency -- end-to-end message delivery within 100ms at p95 for push subscribers; publish acknowledgment within 50ms
- Reliability -- at-least-once delivery guarantee by default; messages are persisted durably before acknowledgment; no messages lost during broker failures
- Ordering -- messages within a single partition are delivered in the order they were published; no global ordering guarantee across partitions
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Message Storage and Partitioning
How you store and partition messages determines throughput, ordering guarantees, and scalability. Interviewers expect you to reason about log-based storage and partition assignment strategies.
Hints to consider:
- Use an append-only log per partition (similar to Kafka's segment files) for sequential writes that maximize disk throughput
- Partition topics by a key (e.g., hash of message key) to enable parallel consumption while maintaining per-partition ordering
- Determine the number of partitions per topic at creation time, balancing parallelism against the overhead of managing many partitions
- Store messages with offsets (monotonically increasing sequence numbers) so consumers can track their position and replay from any point
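The storage hints above can be condensed into two small pieces: an append-only log whose offsets are just positions in the log, and a hash router that pins a key to a partition. This is a minimal in-memory sketch (Kafka-style systems persist segments to disk; `PartitionLog` and `partition_for` are illustrative names):

```python
import hashlib

class PartitionLog:
    """Append-only per-partition log. Offsets increase monotonically,
    so consumers can track position and replay from any point."""

    def __init__(self):
        self.entries = []     # offset == base_offset + index in this list
        self.base_offset = 0  # would advance when old segments are purged

    def append(self, message) -> int:
        """Sequential write; returns the message's offset."""
        self.entries.append(message)
        return self.base_offset + len(self.entries) - 1

    def read_from(self, offset, max_messages=100):
        """Replay messages starting at `offset` (consumer restart / catch-up)."""
        start = offset - self.base_offset
        return self.entries[start:start + max_messages]

def partition_for(key: str, num_partitions: int) -> int:
    """Stable hash routing: the same key always lands on the same partition,
    which is what preserves per-key ordering."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

A stable hash (rather than Python's process-randomized `hash`) matters here: routing must agree across brokers and restarts, or per-key ordering breaks.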
2. Consumer Group Coordination
When multiple consumers in a group process messages from a topic, you need a protocol for assigning partitions to consumers and rebalancing when consumers join or leave.
Hints to consider:
- Assign each partition to exactly one consumer in a group; a consumer may handle multiple partitions, but a partition is never split across consumers
- Use a coordination service (ZooKeeper or an internal protocol) to detect consumer failures and trigger rebalancing
- Implement cooperative rebalancing where only affected partitions are reassigned, minimizing disruption compared to stop-the-world rebalancing
- Track consumer offsets (the last successfully processed message) per partition per group, stored durably to support consumer restarts
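The assignment invariant in the hints (each partition owned by exactly one consumer, one consumer may own several) can be sketched with a simple round-robin assignor, plus a diff that shows why cooperative rebalancing is cheaper than stop-the-world. Both function names are illustrative assumptions:

```python
def assign_partitions(partitions, consumers):
    """Round-robin assignment over sorted inputs (deterministic, so every
    member of the group computes the same answer). Each partition goes to
    exactly one consumer; a consumer may own several partitions."""
    ordered = sorted(consumers)
    assignment = {c: [] for c in ordered}
    for i, p in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(p)
    return assignment

def moved_partitions(old, new):
    """Partitions whose owner changed between two assignments.
    Cooperative rebalancing revokes only these, instead of stopping
    every consumer in the group."""
    old_owner = {p: c for c, ps in old.items() for p in ps}
    new_owner = {p: c for c, ps in new.items() for p in ps}
    return {p for p in new_owner if old_owner.get(p) != new_owner[p]}
```

For example, when a third consumer joins a group that owns four partitions, only the partitions it takes over (plus any shuffled by the assignor) need to pause; the rest keep processing uninterrupted.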
3. Delivery Guarantees and Acknowledgment Protocol
The acknowledgment protocol between brokers and consumers determines whether you achieve at-most-once, at-least-once, or exactly-once delivery. Interviewers probe the trade-offs.
Hints to consider:
- At-least-once: the consumer processes the message and then commits the offset; if it crashes before committing, the message is redelivered on restart
- At-most-once: the consumer commits the offset before processing; if it crashes during processing, the message is lost
- Exactly-once requires idempotent consumers or transactional offset commits coordinated with processing side effects
- Design a visibility timeout for push-based delivery: if the subscriber does not acknowledge within a time window, the message is redelivered to another consumer
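The first two hints differ only in where the offset commit sits relative to processing, which is easiest to see side by side. The `fetch`/`process`/`commit` callbacks below are hypothetical stand-ins for a real client's API:

```python
def consume_at_least_once(fetch, process, commit):
    """Process first, commit after. A crash between process() and commit()
    means the message is redelivered on restart, so process() must be
    idempotent or tolerate duplicates."""
    for offset, msg in fetch():
        process(msg)
        commit(offset)  # committed only after successful processing

def consume_at_most_once(fetch, process, commit):
    """Commit first, process after. A crash during process() loses the
    message: the offset is already past it."""
    for offset, msg in fetch():
        commit(offset)  # message is considered delivered from here on
        process(msg)
```

Exactly-once is not a third loop: it requires making `process` and `commit` atomic (a transaction spanning both) or making `process` idempotent so that at-least-once redelivery is harmless.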
4. Backpressure and Flow Control
When consumers cannot keep up with publishers, unbounded message buffering leads to memory exhaustion and cascading failures. Interviewers want to see your strategy for managing flow.
Hints to consider:
- Implement per-partition retention policies (time-based or size-based) so old messages are automatically purged when consumers fall too far behind
- Use pull-based consumption where consumers control their fetch rate, naturally applying backpressure without server-side buffering
- For push-based subscribers, implement rate limiting and exponential backoff when delivery attempts fail or are throttled
- Monitor consumer lag (the gap between the latest published offset and the consumer's committed offset) and alert when it exceeds thresholds
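The lag metric from the last hint is cheap to compute from offsets the broker already tracks. A minimal sketch, assuming an in-memory monitor with an illustrative threshold (the class and field names are not from any real system):

```python
class LagMonitor:
    """Track the gap between the latest published offset and each group's
    committed offset per partition; flag when it crosses a threshold."""

    def __init__(self, alert_threshold=10_000):
        self.latest = {}     # (topic, partition) -> last published offset
        self.committed = {}  # (topic, partition, group) -> committed offset
        self.alert_threshold = alert_threshold

    def record_publish(self, topic, partition, offset):
        self.latest[(topic, partition)] = offset

    def record_commit(self, topic, partition, group, offset):
        self.committed[(topic, partition, group)] = offset

    def lag(self, topic, partition, group):
        head = self.latest.get((topic, partition), -1)
        done = self.committed.get((topic, partition, group), -1)
        return max(0, head - done)

    def over_threshold(self, topic, partition, group):
        return self.lag(topic, partition, group) > self.alert_threshold
```

Rising lag is the early-warning signal that ties the other hints together: it tells you when to scale the consumer group, throttle publishers, or accept that retention will purge unread messages.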
Suggested Approach
Step 1: Clarify Requirements
Confirm the expected message throughput and topic count, then ask:
- Retention -- do messages need to be retained after consumption (for replay), or can they be discarded once delivered?
- Delivery model -- push (server calls a subscriber endpoint), pull (consumer polls for messages), or both?
- Message size -- what are the payload limits? Should the system carry large payloads or just metadata with references?
- Ordering -- is per-partition ordering sufficient, or do some use cases require global ordering?
- Multi-tenancy -- should the system support multiple isolated namespaces with separate quotas?
- Durability -- must every acknowledged message survive a broker failure?