Design Messenger/Chat Application
Problem Statement
Design a messaging system like WhatsApp or Meta Messenger that supports real-time one-on-one text conversations between users. The system must deliver messages instantly, track delivery and read status, show when contacts are online, and keep conversation history synchronized across multiple devices. Users expect their messages to arrive in order, never be lost, and sync seamlessly between phone, tablet, and desktop even with unreliable mobile networks.
At Amazon, interviewers use this question to probe your ability to design low-latency, highly available systems with billions of events, handle at-least-once delivery semantics, and reason about fan-in/fan-out patterns, presence tracking, and multi-device synchronization. Expect to discuss WebSocket connection management, message ordering guarantees, and storage strategies for conversation history at scale.
Key Requirements
Functional
- One-to-one messaging -- users can send text messages to any contact with real-time delivery and clear sent/delivered/read status indicators
- Multi-device sync -- conversations and message states remain consistent across all of a user's active devices
- Offline messaging -- users can compose and queue messages when disconnected; messages send automatically when connectivity resumes
- Presence indicators -- display whether contacts are currently online or show their last-seen timestamp
Non-Functional
- Scalability -- support 2 billion monthly active users sending 100 billion messages per day with peak traffic 3x average
- Reliability -- guarantee at-least-once delivery with no message loss; tolerate datacenter failures and network partitions
- Latency -- deliver messages end-to-end in under 200ms for online users; presence updates propagate within 500ms
- Consistency -- maintain strict message ordering within each conversation; eventual consistency acceptable for presence and read receipts
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Message Delivery Architecture and Ordering Guarantees
Interviewers want to see how you ensure messages arrive exactly once (from the user's perspective) in the correct order, even with network retries, multiple devices, and distributed infrastructure.
Hints to consider:
- Assign monotonically increasing sequence numbers per conversation to establish total ordering
- Use client-generated idempotency keys (UUIDs) to deduplicate retries at the server
- Partition message queues by conversation ID to preserve ordering guarantees within each chat
- Implement layered acknowledgments: the server confirms receipt to the sender (sent), and the recipient's device confirms delivery back to the server (delivered)
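The first three hints can be combined in a small sketch. This is a toy in-memory model (the class and field names are made up for illustration; a real system would back this with a partitioned, replicated log): the server assigns a monotonically increasing per-conversation sequence number, and a client-generated idempotency key makes retries safe.

```python
import uuid

class ConversationLog:
    """Toy per-conversation sequencing with idempotent appends (sketch)."""

    def __init__(self):
        self.next_seq = 1
        self.seen_keys = {}   # idempotency key -> previously assigned seq
        self.messages = []    # ordered list of (seq, sender, text)

    def append(self, idempotency_key, sender, text):
        # A retry carrying the same client-generated key returns the
        # original sequence number instead of writing a duplicate.
        if idempotency_key in self.seen_keys:
            return self.seen_keys[idempotency_key]
        seq = self.next_seq
        self.next_seq += 1
        self.messages.append((seq, sender, text))
        self.seen_keys[idempotency_key] = seq
        return seq

log = ConversationLog()
key = str(uuid.uuid4())
first = log.append(key, "alice", "hi")
retry = log.append(key, "alice", "hi")   # network retry, same key
assert first == retry and len(log.messages) == 1
```

Because all appends for one conversation flow through one sequencer (one queue partition), the assigned sequence numbers give a total order that every device can replay consistently.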
2. Real-Time Connection Management and WebSocket Scaling
Maintaining persistent bidirectional connections for billions of concurrent users while routing messages efficiently is a core challenge. Interviewers probe how you distribute connection state, handle reconnections, and avoid single points of failure.
Hints to consider:
- Deploy a fleet of stateful WebSocket gateway servers with consistent hashing to distribute connections
- Store user-to-gateway routing information in Redis with TTL-based cleanup for detecting dead connections
- Implement heartbeat protocols and graceful reconnection with sequence-number-based catch-up for missed messages
- Use a pub/sub layer (Kafka or Redis Streams) to decouple message ingestion from connection fan-out
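The consistent-hashing hint above can be sketched as a ring of virtual nodes mapping users to gateway hosts (the gateway names here are invented for illustration). The key property is that adding a gateway remaps only a fraction of users, so most WebSocket connections survive a fleet resize.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    """Stable hash for ring placement (md5 chosen only for determinism)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class GatewayRing:
    """Consistent-hash ring assigning users to WebSocket gateways (sketch)."""

    def __init__(self, gateways, vnodes=100):
        # Each gateway gets `vnodes` points on the ring to smooth the load.
        self.ring = sorted(
            (_hash(f"{g}#{i}"), g) for g in gateways for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    def gateway_for(self, user_id: str) -> str:
        # First ring point clockwise of the user's hash owns the connection.
        idx = bisect.bisect(self.keys, _hash(user_id)) % len(self.ring)
        return self.ring[idx][1]

ring = GatewayRing(["gw-1", "gw-2", "gw-3"])
users = ("u1", "u2", "u3", "u4")
before = {u: ring.gateway_for(u) for u in users}
bigger = GatewayRing(["gw-1", "gw-2", "gw-3", "gw-4"])
moved = sum(before[u] != bigger.gateway_for(u) for u in users)
```

In practice the ring only picks the gateway at connect time; the authoritative user-to-gateway mapping still lives in the Redis routing table with a TTL, so the fan-out path never has to recompute hashes for a user who is already connected.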
3. Multi-Device Synchronization
When a user has multiple active devices, all must show the same conversation state. Interviewers look for explicit strategies to sync message history and read receipts without race conditions.
Hints to consider:
- Maintain per-device read cursors (last seen message sequence number) in the database to track individual device state
- Use the maximum read cursor across all devices as the authoritative read-by-user position (the basis for read receipts shown to the other party)
- Implement a sync protocol where devices fetch missing messages based on their last known sequence number on reconnect
- Handle delivery receipts at the conversation level while tracking read receipts per device
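A minimal sketch of the cursor approach, with names invented for illustration: each device keeps a forward-only read cursor, the user-level read position is the maximum across devices, and a reconnecting device asks for everything past its own cursor.

```python
class ReadState:
    """Per-device read cursors for one user in one conversation (sketch)."""

    def __init__(self):
        self.cursors = {}  # device_id -> highest read sequence number

    def mark_read(self, device_id, seq):
        # Cursors only move forward, which tolerates late or out-of-order
        # read acknowledgments from a flaky mobile connection.
        self.cursors[device_id] = max(self.cursors.get(device_id, 0), seq)

    def user_read_seq(self):
        # The user has "read" a message if any of their devices has.
        return max(self.cursors.values(), default=0)

    def missing_range(self, device_id, latest_seq):
        # On reconnect, a device fetches every message past its own cursor.
        return range(self.cursors.get(device_id, 0) + 1, latest_seq + 1)

state = ReadState()
state.mark_read("phone", 42)
state.mark_read("laptop", 40)
state.mark_read("phone", 41)          # stale ack arrives late; ignored
assert state.user_read_seq() == 42
assert list(state.missing_range("laptop", 42)) == [41, 42]
```

Note the asymmetry this models: the read receipt sent to the other party is derived from the user-level maximum, while catch-up sync is driven by each device's individual cursor.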
4. Storage Strategy for Conversation History
Storing and retrieving message history at scale requires careful data modeling. Interviewers evaluate your ability to choose appropriate storage systems and handle hot conversations.
Hints to consider:
- Model conversations as append-only logs with composite keys (conversation_id, timestamp/sequence) for efficient range scans
- Partition by conversation ID and use time-based bucketing for archival of older messages
- Cache recent conversation windows (last 50-100 messages) in Redis with TTL eviction for fast loading
- Implement cursor-based pagination for history retrieval rather than offset-based queries
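The pagination hint can be sketched against an in-memory stand-in for the append-only log (in a real store this would be a range scan on the composite key, e.g. `conversation_id` plus sequence number; the function and field names here are assumptions). The cursor is the sequence number of the oldest message on the previous page, so results stay stable even as new messages are appended.

```python
import bisect

def page_history(messages, before_seq=None, limit=50):
    """Cursor-based pagination over messages sorted by ascending seq.

    Returns a newest-first page and the cursor for the next (older) page,
    or None when history is exhausted. Sketch only.
    """
    seqs = [m["seq"] for m in messages]
    end = bisect.bisect_left(seqs, before_seq) if before_seq else len(messages)
    start = max(end - limit, 0)
    page = list(reversed(messages[start:end]))   # newest first for the UI
    next_cursor = page[-1]["seq"] if page and start > 0 else None
    return page, next_cursor

msgs = [{"seq": i, "text": f"m{i}"} for i in range(1, 101)]
page1, cursor = page_history(msgs, limit=20)     # newest 20: seq 100..81
page2, _ = page_history(msgs, before_seq=cursor, limit=20)  # seq 80..61
assert page1[0]["seq"] == 100 and cursor == 81
assert page2[0]["seq"] == 80
```

Unlike offset-based queries, this stays correct and cheap when messages are inserted between requests, and it maps directly onto a key-range scan in a wide-column or log-structured store.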