Problem Statement
Design a social media platform similar to Twitter where users post short text updates, follow other accounts, and consume a personalized home timeline. The twist in this variant (asked at Asana, Amazon, and others) is that the platform also incorporates product discovery and shopping features: users can attach product links to tweets, browse recommended products in their feed, and complete purchases without leaving the app.
The core engineering challenges are building a low-latency feed that merges social content with product recommendations, handling extreme read amplification from celebrity accounts with millions of followers, orchestrating reliable e-commerce checkout flows alongside social interactions, and maintaining a crisp user experience across two very different domains. You will need to reason about fanout strategies, timeline materialization, caching, saga-based payment workflows, and domain boundary separation.
Key Requirements
Functional
- Posting -- users create short text updates with optional media attachments and product tags
- Following and timeline -- users follow accounts and see a personalized, near-real-time home timeline mixing followed content with recommended products
- Search and discovery -- users search for tweets and products, view product details in-app, and browse trending topics
- Shopping checkout -- users add products to a cart, complete payment securely, and receive order confirmation and delivery status notifications
Non-Functional
- Scalability -- support 500 million daily active users, 500 million new tweets per day, and a read-to-write ratio exceeding 100:1
- Reliability -- guarantee no lost tweets, no double charges, and no inventory oversells; tolerate datacenter failures
- Latency -- home timeline loads in under 200 ms at p95; tweet posting completes in under 500 ms; checkout confirmation in under two seconds
- Consistency -- eventual consistency acceptable for timeline and follower counts; strong consistency required for payment and inventory operations
What Interviewers Focus On
Based on real interview experiences at Asana, Cloudflare, Brex, Amazon, and Snapchat, these are the areas interviewers probe most deeply:
1. Timeline Fanout Strategy
The most common deep-dive area. Celebrity accounts with millions of followers create massive write amplification under fanout-on-write, while fanout-on-read adds latency for every timeline load. Interviewers want to see a hybrid approach with clear reasoning.
Hints to consider:
- Use fanout-on-write for regular users: when a user posts, a worker pushes the tweet ID into each follower's precomputed timeline in a cache (Redis sorted set keyed by user ID, scored by timestamp)
- For high-follower accounts (celebrities, brands), skip the write fanout; instead, merge their tweets into the timeline at read time by querying a small "celebrity tweets" index
- Define a follower-count threshold (for example, 10,000) that triggers the switch from write to read fanout
- Precompute timelines for only active users (those who logged in within the last seven days) to avoid wasting resources on dormant accounts
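The hybrid fanout decision above can be sketched in a few lines. This is a minimal illustration, not a production implementation: plain in-memory dicts stand in for the Redis sorted sets and celebrity index, and the threshold constant and function names are hypothetical.

```python
import time

CELEBRITY_THRESHOLD = 10_000  # follower count above which we skip write fanout

# In-memory stand-ins for Redis: per-follower timeline sorted sets, plus a
# small shared index of recent celebrity tweets merged at read time.
timelines: dict[str, list[tuple[float, str]]] = {}
celebrity_index: list[tuple[float, str]] = []

def fan_out(poster_id: str, tweet_id: str, followers: list[str],
            follower_count: int, active: set[str]) -> str:
    """Push a new tweet into follower timelines, or defer to read-time merge."""
    ts = time.time()
    if follower_count >= CELEBRITY_THRESHOLD:
        # Celebrity path: one write to a shared index instead of millions of pushes.
        celebrity_index.append((ts, tweet_id))
        return "read-fanout"
    for follower in followers:
        if follower in active:  # only materialize timelines for active users
            timelines.setdefault(follower, []).append((ts, tweet_id))
    return "write-fanout"
```

Note that dormant followers are skipped entirely; their timelines are rebuilt on demand the next time they log in.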
2. Feed Ranking and Product Recommendation Injection
The timeline is not purely chronological -- it blends social posts with product recommendations. Interviewers probe how you merge these two data sources without degrading latency.
Hints to consider:
- Fetch the raw timeline (precomputed tweet IDs) and a small batch of recommended product IDs from a separate recommendation service in parallel
- Apply a lightweight ranking model at the edge that interleaves social posts and products based on engagement signals, recency, and user affinity
- Cache ranked feed pages for short TTLs (30-60 seconds) so repeated scrolls do not re-invoke the ranker
- Keep the recommendation service decoupled from the social graph; it consumes engagement events from Kafka and maintains its own feature store
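A rough sketch of the parallel-fetch-and-interleave pattern follows. The fetch functions are stand-ins for the Redis read and the recommendation service call, and the positional injection rule is a deliberately simplistic placeholder for a real ranking model.

```python
from concurrent.futures import ThreadPoolExecutor

PRODUCT_SLOT_EVERY = 4  # hypothetical injection rate: one product per 4 tweets

def fetch_timeline_ids(user_id: str) -> list[str]:
    return [f"tweet-{i}" for i in range(8)]  # stand-in for the Redis read

def fetch_recommended_products(user_id: str) -> list[str]:
    return ["prod-a", "prod-b"]  # stand-in for the recommendation service call

def build_feed(user_id: str) -> list[str]:
    # Fetch both sources in parallel so latency is bounded by the slower call,
    # not the sum of the two.
    with ThreadPoolExecutor(max_workers=2) as pool:
        tweets_future = pool.submit(fetch_timeline_ids, user_id)
        products_future = pool.submit(fetch_recommended_products, user_id)
        tweets, products = tweets_future.result(), products_future.result()
    feed, product_iter = [], iter(products)
    for i, tweet in enumerate(tweets, start=1):
        feed.append(tweet)
        if i % PRODUCT_SLOT_EVERY == 0:  # positional interleave, not a real ranker
            nxt = next(product_iter, None)
            if nxt is not None:
                feed.append(nxt)
    return feed
```

In a real system the interleave step would score each candidate on engagement, recency, and affinity; the fixed slot rule here just shows where that logic plugs in.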
3. E-Commerce Checkout Workflow
Adding shopping to a social platform introduces multi-step transactional workflows that must not interfere with the low-latency social experience. Interviewers expect saga-based orchestration.
Hints to consider:
- Model checkout as a saga: reserve inventory, authorize payment, create order, send confirmation; each step has a compensating rollback action
- Use idempotency keys on all payment-provider API calls to make retries safe
- Store cart state in Redis with TTL-based expiry for abandoned carts; persist confirmed orders in PostgreSQL
- Isolate the commerce write path from the social write path so a payment-provider outage does not degrade timeline performance
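The saga shape described above can be sketched as a list of (action, compensation) pairs executed in order, with compensations replayed in reverse on failure. Everything here is illustrative: the step functions mutate an in-memory order dict, and the idempotency-key handling only hints at what a real payment-provider call would need.

```python
import uuid

def run_checkout_saga(order: dict, steps) -> str:
    """Execute saga steps in order; on any failure, compensate in reverse."""
    completed = []
    for action, compensate in steps:
        try:
            action(order)
            completed.append(compensate)
        except Exception:
            for undo in reversed(completed):
                undo(order)
            return "rolled-back"
    return "confirmed"

# Hypothetical step implementations against an in-memory order dict.
def reserve_inventory(order): order["inventory"] = "reserved"
def release_inventory(order): order["inventory"] = "released"

def authorize_payment(order):
    # A stable idempotency key makes provider-side retries safe to repeat.
    order.setdefault("idempotency_key", str(uuid.uuid4()))
    if order.get("card_declined"):
        raise RuntimeError("payment declined")
    order["payment"] = "authorized"

def void_payment(order): order["payment"] = "voided"
def create_order(order): order["status"] = "created"
def cancel_order(order): order["status"] = "cancelled"

CHECKOUT_STEPS = [
    (reserve_inventory, release_inventory),
    (authorize_payment, void_payment),
    (create_order, cancel_order),
]
```

A declined card triggers only the compensations for steps that already succeeded (here, releasing the inventory reservation), which is exactly the property that prevents oversells and double charges.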
4. Storage and Caching Architecture
Serving sub-200 ms timelines at massive scale demands aggressive caching and careful data modeling. Interviewers assess your technology choices and cache invalidation strategy.
Hints to consider:
- Store tweet metadata in Amazon DynamoDB or Cassandra for high write throughput and predictable read latency, partitioned by tweet ID
- Cache precomputed timelines in Redis sorted sets; evict entries older than a configured window (for example, two weeks)
- Use a separate Elasticsearch cluster for tweet and product search, updated asynchronously via change data capture from the primary store
- Maintain hot counters (likes, retweets, view counts) in Redis and periodically flush to durable storage to avoid write amplification on the primary database
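The hot-counter pattern in the last hint can be sketched as follows, with a Counter standing in for Redis and a dict standing in for the primary database; the function names are made up for illustration.

```python
import collections

# In-memory stand-in for Redis hot counters, e.g. like counts per tweet.
hot_counters: collections.Counter = collections.Counter()
durable_store: dict[str, int] = {}  # stand-in for the primary database

def incr_like(tweet_id: str) -> None:
    # Absorb high-frequency writes in the cache instead of the primary store.
    hot_counters[tweet_id] += 1

def flush_counters() -> int:
    """Periodically drain accumulated deltas into durable storage in one batch."""
    flushed = 0
    for tweet_id, delta in hot_counters.items():
        durable_store[tweet_id] = durable_store.get(tweet_id, 0) + delta
        flushed += delta
    hot_counters.clear()
    return flushed
```

A burst of thousands of likes on one tweet thus becomes a single batched write per flush interval rather than thousands of row updates.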
5. Domain Separation Between Social and Commerce
Tightly coupling tweet metadata with product catalog data leads to schema complexity and performance interference. Interviewers want clean boundaries.
Hints to consider:
- Treat social (tweets, follows, timelines) and commerce (products, carts, orders, inventory) as separate bounded contexts with independent databases
- Join the two at the application layer: a tweet references a product ID, and the client fetches product details from the commerce API when rendering
- Use Kafka as the integration backbone: tweet-engagement events flow to the recommendation service; order-completion events flow to the social notification service
- Avoid sharing database schemas or tables across the two domains to enable independent scaling and deployment
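The application-layer join can be sketched in miniature. The dicts below are hypothetical stand-ins for the two domains' independent stores; the only cross-domain link is the product ID carried on the tweet.

```python
# Hypothetical app-layer join: a tweet stores only a product ID, and the
# renderer hydrates product details from the commerce domain's own store.
social_tweets = {"t1": {"text": "love these shoes", "product_id": "p9"}}
commerce_products = {"p9": {"name": "Trail Runner", "price_cents": 8999}}

def render_tweet(tweet_id: str) -> dict:
    tweet = dict(social_tweets[tweet_id])
    product_id = tweet.pop("product_id", None)
    if product_id is not None:
        # Join at the application layer; no shared tables or schemas.
        tweet["product"] = commerce_products.get(product_id)
    return tweet
```

Because the join happens at render time, either domain can change its schema, scale, or deploy independently without breaking the other.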
Suggested Approach
Step 1: Clarify Requirements
Confirm whether the scope is a full Twitter clone with shopping or a simpler microblog. Ask about scale: daily active users, tweets per day, and read-to-write ratio. Clarify the product catalog size and whether inventory is managed by the platform or by third-party sellers. Determine if the recommendation engine is in scope or a black box. Verify latency and consistency expectations for timeline versus checkout.
Step 2: High-Level Architecture
Sketch the main components split across two domains. Social domain: API gateway, tweet ingestion service, fanout workers, Redis timeline cache, DynamoDB for tweet storage, Elasticsearch for search, and a follower graph service. Commerce domain: product catalog service, cart service (Redis-backed), order service, payment gateway integration, and inventory service (PostgreSQL). Shared infrastructure: Kafka event bus connecting the two domains, a recommendation service consuming engagement events, and a notification service for both social alerts and order updates.
Step 3: Deep Dive on Timeline Fanout
Walk through what happens when a user posts a tweet. The tweet is written to DynamoDB and published to Kafka. A fanout worker consumes the event and checks the poster's follower count. If it is below the threshold, the worker iterates over the followers (fetched from the graph service in batches) and pushes the tweet ID into each active follower's Redis timeline sorted set. If the poster is a celebrity, the worker skips the fanout entirely. When a user loads their timeline, the API fetches the precomputed list from Redis, merges in recent tweets from followed celebrities by querying the celebrity tweet index, and finally calls the recommendation service for product insertions. A ranker blends the results and returns a paginated feed.
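The read-path merge in that walkthrough can be sketched with a timestamp-ordered merge of the two sources. This assumes both inputs arrive as (timestamp, tweet_id) pairs; the function name and limit parameter are illustrative.

```python
import heapq

def load_home_timeline(precomputed: list[tuple[float, str]],
                       celebrity_recent: list[tuple[float, str]],
                       limit: int = 5) -> list[str]:
    """Merge the user's precomputed timeline with recent celebrity tweets,
    newest first, before handing the result to the ranker."""
    # heapq.merge needs each input already sorted (descending, since reverse=True).
    merged = heapq.merge(sorted(precomputed, reverse=True),
                         sorted(celebrity_recent, reverse=True),
                         reverse=True)
    return [tweet_id for _, tweet_id in merged][:limit]
```

The celebrity list is small (only accounts this user follows, over a short recency window), so the extra read-time merge stays cheap.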
Step 4: Address Secondary Concerns
Cover checkout: saga with inventory reservation, payment authorization, order creation, and compensation on failure. Discuss search: Elasticsearch indexes updated via change data capture with eventual consistency. Address notifications: Kafka-driven delivery of new-follower, like, and order-status events to mobile push and in-app channels. Mention monitoring: track timeline p95 latency, fanout lag, cache hit rate, payment success rate, and inventory contention. Discuss scaling: add Redis shards for timeline growth, partition DynamoDB by tweet ID, scale fanout workers horizontally, and use read replicas for the commerce database.
Related Learning Resources
- Design Slack -- covers real-time message delivery, channel fanout, and WebSocket infrastructure patterns applicable to timeline updates
- Design a Payment System -- explores saga-based checkout workflows, idempotency, and payment gateway integration relevant to the commerce side