Design a messaging system like WhatsApp or Meta Messenger that supports real-time 1:1 chat, message delivery and read status tracking, user presence indicators, and seamless multi-device synchronization. Users expect to send a text message and have it appear on the recipient's screen within a fraction of a second, with clear visual feedback showing whether the message was sent, delivered, and read. The platform must also support offline messaging, where messages composed without connectivity are queued locally and transmitted once the network is restored.
At MongoDB scale, the interviewer wants to see how you model conversation data for high write throughput and efficient retrieval, how you route messages through a fleet of WebSocket servers without creating single points of failure, and how you keep multiple devices per user in sync without duplicating or losing messages. You should be prepared to discuss partitioning strategies that keep per-conversation ordering intact, caching layers for recent messages, and fan-out mechanics for delivery acknowledgments.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Interviewers want to understand how you move a message from sender to recipient reliably and in order, even under retries, network failures, and multi-device scenarios. This tests your grasp of distributed delivery semantics and idempotency.
Maintaining persistent bidirectional connections for hundreds of millions of concurrent users is a defining challenge. Interviewers probe how you distribute connections, handle failover, and route messages to the correct gateway.
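One common approach to routing is a connection registry keyed by user, where each gateway heartbeats its entries so that connections on a crashed gateway age out. The sketch below is an in-memory stand-in for what would typically live in Redis with per-entry TTLs; the class and method names are illustrative, not from any specific system.

```python
import time

# Hypothetical in-memory stand-in for a Redis-backed connection registry:
# maps user_id -> {device_id: (gateway_id, expires_at)}. In production the
# TTL would be refreshed by gateway heartbeats so stale entries expire.
class ConnectionRegistry:
    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._conns = {}  # user_id -> {device_id: (gateway_id, expires_at)}

    def register(self, user_id, device_id, gateway_id, now=None):
        now = now if now is not None else time.time()
        self._conns.setdefault(user_id, {})[device_id] = (gateway_id, now + self.ttl)

    def unregister(self, user_id, device_id):
        self._conns.get(user_id, {}).pop(device_id, None)

    def gateways_for(self, user_id, now=None):
        """Return live {device_id: gateway_id}, dropping expired entries."""
        now = now if now is not None else time.time()
        return {d: g for d, (g, exp) in self._conns.get(user_id, {}).items()
                if exp > now}

reg = ConnectionRegistry(ttl_seconds=60)
reg.register("alice", "phone", "gw-3", now=0)
reg.register("alice", "laptop", "gw-7", now=0)
print(reg.gateways_for("alice", now=30))   # both devices still live
print(reg.gateways_for("alice", now=120))  # heartbeats lapsed, registry empty
```

The TTL-based expiry is what makes gateway failover graceful: when a gateway dies, its entries simply stop being refreshed, and delivery workers stop routing to it within one TTL window.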
When a user operates multiple devices simultaneously, all must converge on the same conversation state. Interviewers look for explicit per-device cursors, sync protocols, and deterministic conflict handling.
Storing and querying billions of messages efficiently requires careful schema choices. Interviewers evaluate your ability to pick the right storage engine, partition effectively, and avoid hot partitions.
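A standard way to avoid hot or unbounded partitions is to bucket each conversation's messages by a coarse time window, so the partition key is (conversation_id, bucket) rather than conversation_id alone. The bucket width below is an illustrative assumption, not a prescribed value:

```python
# Hypothetical partition-key scheme for the message table: bucket each
# conversation by week so no single partition grows without bound
# (the classic "wide partition" problem in Cassandra-style stores).
SECONDS_PER_BUCKET = 7 * 24 * 3600  # one bucket per week; a tuning assumption

def partition_key(conversation_id: str, sent_at_epoch: int) -> tuple:
    bucket = sent_at_epoch // SECONDS_PER_BUCKET
    return (conversation_id, bucket)

# Within a partition, messages cluster by sequence number, so
# "latest N messages" is a single-partition range read.
print(partition_key("conv-42", 1_700_000_000))
```

Reads for recent history hit the newest bucket first and walk backward only if more messages are needed, which keeps the common case to one partition.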
Presence generates enormous write volume (every online/offline transition, every heartbeat) but tolerates relaxed consistency. Interviewers want to see how you optimize this high-frequency signal without overwhelming storage.
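Because presence tolerates relaxed consistency, a common trick is to derive "offline" from missing heartbeats rather than writing an explicit transition, which both absorbs flapping and cuts write volume. A minimal sketch, with an assumed grace period:

```python
# Sketch of heartbeat-based presence: the service records only heartbeats,
# and a user is considered offline once heartbeats stop for GRACE seconds.
# The grace period (30s) is an assumption for illustration.
class PresenceTracker:
    GRACE = 30  # seconds without a heartbeat before a user reads as offline

    def __init__(self):
        self._last_seen = {}  # user_id -> timestamp of last heartbeat

    def heartbeat(self, user_id, now):
        self._last_seen[user_id] = now

    def status(self, user_id, now):
        last = self._last_seen.get(user_id)
        if last is None or now - last > self.GRACE:
            return "offline"  # never seen, or heartbeats lapsed
        return "online"

p = PresenceTracker()
p.heartbeat("bob", now=0)
print(p.status("bob", now=10))  # online
print(p.status("bob", now=60))  # offline
```

Deriving offline state lazily also means a brief disconnect-and-reconnect never generates a visible offline blip for subscribed contacts.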
Confirm the scope with the interviewer. Ask whether group messaging is in scope or only 1:1 conversations. Clarify the expected scale: daily active users, messages per day, and peak-to-average traffic ratio. Determine whether multimedia (images, voice) is required or text-only. Ask about end-to-end encryption implications on delivery tracking. Establish latency SLAs for message delivery and presence. Confirm whether message search or history retention policies are in scope.
Sketch the core components: client apps (mobile, web, desktop), an API gateway for authentication and HTTP endpoints, a fleet of WebSocket gateway servers for persistent connections, a message ingestion service, Kafka partitioned by conversation ID for durable ordered delivery, delivery worker consumers, a conversation store (Cassandra or sharded MongoDB), Redis for connection routing and recent message caching, and a separate presence service. Trace the message flow end-to-end: sender posts message, ingestion service validates and writes to Kafka, delivery worker reads from the partition, looks up recipient gateway in Redis, pushes via WebSocket, and stores the message in the conversation store. Show the acknowledgment flow back to the sender.
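The delivery-worker half of this flow can be sketched in a few lines. The loop below uses tiny in-memory doubles for the Kafka consumer, registry, gateway fleet, and store; all names are illustrative assumptions, not a real client API:

```python
# Minimal sketch of a delivery worker: consume messages in partition order,
# persist each one, look up the recipient's connected gateways, and push
# to every device. Consumer, registry, gateways, and store are stand-ins.
def run_delivery_worker(consumer, registry, gateways, store):
    for msg in consumer:  # one Kafka partition, keyed by conversation_id
        store.append(msg)  # durable write to the conversation store
        for device_id, gw_id in registry.get(msg["to"], {}).items():
            gateways[gw_id].push(device_id, msg)  # WebSocket push

class FakeGateway:
    def __init__(self):
        self.pushed = []
    def push(self, device_id, msg):
        self.pushed.append((device_id, msg["id"]))

store, gw = [], FakeGateway()
registry = {"bob": {"phone": "gw-1"}}
run_delivery_worker([{"id": "m1", "to": "bob"}], registry, {"gw-1": gw}, store)
print(gw.pushed)  # [('phone', 'm1')]
```

Note that the store write happens regardless of whether any gateway push succeeds, so an offline recipient's messages are waiting in the conversation store for the catch-up path.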
Walk through the critical path in detail. The sender generates a UUID idempotency key and includes the last-known sequence number. The ingestion service validates the payload, assigns a server-side sequence number, writes to the Kafka partition keyed by conversation ID, and returns a "sent" acknowledgment to the sender. A delivery worker consumes from the partition, queries Redis for the recipient's connected gateway servers, and pushes the message to each device. When the recipient's device receives the message, it sends a "delivered" acknowledgment back through its gateway, which the delivery worker propagates to the sender. Explain how retries use the idempotency key to prevent duplicates, and how the sequence number lets reconnecting devices request exactly the messages they missed.
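The ingestion step above can be sketched as follows, assuming an in-memory dedupe table for illustration (in practice the idempotency record would live in a shared store with a TTL):

```python
# Sketch of idempotent ingestion: assign a per-conversation, server-side
# sequence number, but return the previously assigned one when a retry
# arrives with the same idempotency key, so duplicates never get new seqs.
class Ingestion:
    def __init__(self):
        self._next_seq = {}  # conversation_id -> last assigned sequence
        self._seen = {}      # idempotency_key -> previously assigned seq

    def ingest(self, conversation_id, idempotency_key, body):
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]  # retry: no new write, same seq
        seq = self._next_seq.get(conversation_id, 0) + 1
        self._next_seq[conversation_id] = seq
        self._seen[idempotency_key] = seq
        # ...here the (conversation_id, seq, body) record would be written
        # to the Kafka partition keyed by conversation_id...
        return seq

ing = Ingestion()
print(ing.ingest("c1", "k-abc", "hello"))  # 1
print(ing.ingest("c1", "k-abc", "hello"))  # 1 -- retried send, deduplicated
print(ing.ingest("c1", "k-def", "world"))  # 2
```

Because sequence numbers are assigned server-side and gaps are impossible within a conversation, a device that holds sequence N can detect a missed message the moment it sees N+2.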
Cover multi-device sync: each device maintains a watermark of the last received sequence number; on reconnection it requests a catch-up batch from the conversation store. Explain read receipt handling: the client sends a "read" event referencing the latest read sequence number, which updates the per-device cursor and fans out to the sender. Discuss presence: maintain an in-memory user-status map in the presence service, persist transitions to Redis, and fan out only to subscribed contacts. Address offline handling: the client queues messages locally and retries with exponential backoff on reconnection, deduplicating via idempotency keys. Mention monitoring: track delivery latency percentiles, Kafka consumer lag, WebSocket connection churn, and storage hot partition detection.
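The per-device watermark catch-up described above reduces to a single range read against the conversation store. A minimal sketch, with the store modeled as a dict of ordered message lists:

```python
# Sketch of reconnect catch-up: each device persists the highest sequence
# number it has applied; on reconnect it requests everything above that
# watermark, in order, capped by a batch limit.
def catch_up(store, conversation_id, after_seq, limit=100):
    """Return up to `limit` messages with seq > after_seq, in seq order."""
    msgs = store.get(conversation_id, [])
    return [m for m in msgs if m["seq"] > after_seq][:limit]

store = {"c1": [{"seq": s, "body": f"m{s}"} for s in range(1, 6)]}
print(catch_up(store, "c1", after_seq=3))  # messages with seq 4 and 5
```

The same primitive serves both a phone that was offline for an hour and a freshly logged-in desktop client (which simply starts from watermark 0, paginating through history).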
Deepen your understanding of the patterns used in this problem: