For a full example answer with detailed architecture diagrams and deep dives, see our Slack guide.
Design a messaging platform similar to WhatsApp or Meta Messenger that enables users to send and receive text messages in real time. The system should support one-on-one conversations with delivery and read receipts, presence indicators showing when contacts are online, and seamless synchronization of conversation history across multiple devices such as phones, tablets, and desktops.
At Lyft, this question probes your ability to architect a system handling billions of persistent connections and tens of billions of daily messages. The interviewer expects you to reason about how messages flow from sender to recipient through distributed infrastructure, how you maintain ordering guarantees in the face of network failures and retries, and how you keep latency under 200ms for online-to-online message delivery.
Beyond the happy path, consider how users experience the system when connectivity is intermittent. Messages queued offline must send automatically upon reconnection, and devices joining a conversation mid-stream need an efficient catch-up mechanism that avoids re-downloading the entire history.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Interviewers expect you to walk through the full lifecycle of a message from sender to recipient and explain how you guarantee exactly-once delivery semantics from the user's perspective, even when network retries and infrastructure failures occur.
Maintaining persistent bidirectional connections for billions of concurrent users is a core challenge. Interviewers want to understand how you distribute connection state, route messages to the correct gateway, and handle reconnections gracefully.
When a user has multiple active devices, all must converge on the same conversation state. Interviewers look for explicit strategies to synchronize message history, read receipts, and typing indicators without race conditions.
Efficiently storing and retrieving message history at massive scale requires careful data modeling. Interviewers evaluate your ability to choose the right storage systems and partition data to avoid hot spots.
Hints to consider: key messages by (conversation_id, message_id) in a wide-column store like Cassandra.

Confirm whether the system needs group chats or only one-on-one conversations. Verify scale expectations: daily active users, message volume, and read-to-write ratio. Clarify whether multimedia support (images, videos) is in scope or text-only. Establish latency targets for send, receive, and sync operations. Ask whether end-to-end encryption is required, as it impacts delivery tracking and server-side storage.
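The (conversation_id, message_id) hint above assumes message IDs that sort chronologically within a partition. A minimal snowflake-style sketch of such an ID generator; the field widths and the `node_id` parameter are illustrative assumptions, not any production scheme:

```python
import itertools
import time

_seq = itertools.count()  # per-process sequence to break ties within one millisecond

def make_message_id(node_id: int) -> int:
    """Build a sortable 64-bit ID: 41 bits of epoch milliseconds,
    10 bits identifying the generating node, 12 bits of sequence."""
    ms = int(time.time() * 1000)
    return (ms << 22) | ((node_id & 0x3FF) << 12) | (next(_seq) & 0xFFF)
```

Because the timestamp occupies the high bits, sorting rows by this ID within a conversation partition yields chronological order without relying on wall-clock uniqueness.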
Sketch client applications connecting through an API gateway to WebSocket gateway servers for persistent connections. Behind the gateways, place a message ingestion service writing to Kafka partitioned by conversation ID. Delivery workers consume from Kafka and push messages to recipient connections via the gateway fleet. Include Cassandra for durable message storage, Redis for connection routing and caching recent messages, and a separate lightweight presence service.
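The Redis routing table mentioned above can be modeled as a map from user ID to the set of (gateway, connection) pairs currently serving that user's devices. A minimal in-memory sketch; a real deployment would use Redis sets with expiry, and all names here are illustrative:

```python
from collections import defaultdict

# user_id -> set of (gateway_id, connection_id) for each live device connection
routing_table: defaultdict = defaultdict(set)

def register(user_id: str, gateway_id: str, connection_id: str) -> None:
    """Called by a WebSocket gateway when a device connects."""
    routing_table[user_id].add((gateway_id, connection_id))

def unregister(user_id: str, gateway_id: str, connection_id: str) -> None:
    """Called on disconnect so delivery workers stop pushing to a dead socket."""
    routing_table[user_id].discard((gateway_id, connection_id))

def lookup(user_id: str) -> list:
    """Delivery workers resolve which gateways to push a message to."""
    return sorted(routing_table[user_id])
```

Keeping this table out of the gateways themselves is what lets any delivery worker reach any recipient, regardless of which gateway holds the socket.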
Walk through the complete message flow: the sender generates a UUID idempotency key and posts to the ingestion API, which validates the message and writes it to the appropriate Kafka partition. A delivery worker consumes the message, looks up the recipient's device connections in the Redis routing table, and pushes to each connected device via WebSocket. In parallel, the message is persisted to Cassandra under a composite key of (conversation_id, message_id), where the message ID is time-ordered so rows cluster chronologically without relying on wall-clock timestamps, which can collide. Retries reuse the idempotency key to prevent duplicates. On reconnection, devices request all messages after their last known sequence number to catch up.
Discuss presence management: maintain an in-memory map of online users in the presence service, batch status updates every 5-10 seconds, and fan out only to subscribed contacts. Cover offline handling: the client queues messages locally and retries with exponential backoff on reconnection. Address monitoring: track message delivery latency percentiles, WebSocket connection churn rate, Kafka consumer lag, and Cassandra hot partition detection. Explain horizontal scaling: shard WebSocket gateways by user hash, partition Kafka by conversation ID, and leverage Cassandra's built-in sharding.
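The client-side retry policy above can be made concrete as exponential backoff with full jitter; the base delay, cap, and attempt count below are assumed values, not prescribed ones:

```python
import random

def backoff_delays(base: float = 0.5, cap: float = 30.0, attempts: int = 6) -> list:
    """Delay in seconds before each retry: exponential growth, capped,
    with full jitter so reconnecting clients don't retry in lockstep."""
    return [random.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]
```

The jitter matters at this scale: after a gateway restart, it spreads the reconnect storm across the whole backoff window instead of synchronizing millions of clients onto the same instant.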