Design a Social Media Feed Generation System
System Design · Must
Problem Statement
Design the feed generation engine behind a social media platform like Twitter, Instagram, or Facebook. The system collects posts from accounts and topics a user follows, ranks them by relevance and recency, and serves a personalized, paginated timeline that feels fresh and responsive. Users open the app, scroll through content from their network, and expect new posts to appear with minimal delay after publication.
The technical challenge centers on the tension between freshness and cost at scale. When a popular account with millions of followers publishes a post, your system must propagate that content to follower timelines without creating a write storm that overwhelms your infrastructure. You need to choose and defend a fanout strategy, design caching layers that keep the head of the feed fast, implement ranking that balances relevance with chronological ordering, and enforce privacy and content rules across every feed view -- all while maintaining sub-100ms read latency under heavy concurrent load.
Key Requirements
Functional
- Home feed -- Users view a personalized, paginated feed that loads quickly and stays fresh as new posts arrive
- Follow graph -- Users follow and unfollow accounts, and those changes reflect in their feed within seconds
- Post publishing -- Users publish text and media posts that propagate to follower feeds with low latency
- Content ordering -- Feed applies ranking that balances recency with engagement signals, and filters out blocked users and private content the viewer cannot access
Non-Functional
- Scalability -- Support 500M daily active users, 1M new posts per minute, and 10B feed reads per day
- Reliability -- 99.95% uptime for feed reads; no permanent data loss for published posts; graceful degradation during traffic spikes
- Latency -- Feed page load under 100ms at p95; new posts visible in follower feeds within 5 seconds for 99% of cases
- Consistency -- Eventual consistency is acceptable for feed ordering, but privacy enforcement (blocks, private accounts) must be strongly consistent
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Fanout Strategy and Hot-Key Mitigation
The defining architectural choice is how posts reach follower timelines. Interviewers want you to articulate the tradeoffs between fanout-on-write and fanout-on-read, and how you handle celebrity accounts that create extreme write amplification.
Hints to consider:
- Fanout-on-write precomputes timelines by appending each new post to every follower's feed list at write time, giving fast reads but expensive writes for high-follower accounts
- Fanout-on-read assembles the feed at request time by querying recent posts from all followed accounts, avoiding write amplification but requiring fast merge queries
- A hybrid approach uses fanout-on-write for normal accounts (under 100K followers) and fanout-on-read for celebrities, merging both at request time
- Batch and rate-limit fanout for large accounts, processing followers in sharded chunks via async workers to avoid thundering herd effects
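The hybrid approach above can be sketched in a few dozen lines. This is a minimal in-memory model, not a production design: the `FanoutService` class, the 100K follower cutoff, and the data structures are illustrative stand-ins for sharded stores and async workers.

```python
from collections import defaultdict, deque

class FanoutService:
    """Hybrid fanout sketch: fanout-on-write for normal accounts,
    fanout-on-read (merge at request time) for celebrity accounts."""

    CELEBRITY_THRESHOLD = 100_000  # illustrative cutoff from the hint above

    def __init__(self):
        self.followers = defaultdict(set)      # author_id -> follower ids
        self.following = defaultdict(set)      # user_id -> followed author ids
        self.timelines = defaultdict(deque)    # user_id -> (ts, post_id), newest first
        self.author_posts = defaultdict(list)  # author_id -> [(ts, post_id)]

    def follow(self, user_id, author_id):
        self.followers[author_id].add(user_id)
        self.following[user_id].add(author_id)

    def publish(self, author_id, post_id, ts):
        self.author_posts[author_id].append((ts, post_id))
        # Celebrities skip fanout-on-write; their posts merge at read time.
        if len(self.followers[author_id]) < self.CELEBRITY_THRESHOLD:
            for follower in self.followers[author_id]:
                self.timelines[follower].appendleft((ts, post_id))

    def read_feed(self, user_id, limit=20):
        merged = list(self.timelines[user_id])
        # Pull recent posts from followed celebrity accounts at request time.
        for author in self.following[user_id]:
            if len(self.followers[author]) >= self.CELEBRITY_THRESHOLD:
                merged.extend(self.author_posts[author])
        merged.sort(key=lambda entry: entry[0], reverse=True)
        return [post_id for _, post_id in merged[:limit]]
```

In a real system the precomputed timelines would live in a cache tier and the merge would be bounded (e.g., only the last N posts per celebrity), but the shape of the read path is the same: precomputed list plus a small request-time merge.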
2. Caching and Pagination for Feed Reads
Feeds are overwhelmingly read-heavy. Without proper caching, every scroll triggers expensive database queries. Interviewers look for a layered caching strategy that serves the common case from memory.
Hints to consider:
- Store the head of each user's timeline (last 200-500 post IDs) in Redis sorted sets, scored by ranking timestamp
- Use cursor-based pagination with the last-seen post ID or timestamp so clients can resume scrolling without offset drift
- Cache hydrated post objects (author info, media URLs, engagement counts) separately from timeline lists to avoid duplicating data across millions of feeds
- Implement cache warming for users who log in after a dormant period by precomputing their feed asynchronously on session start
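The sorted-set head cache and cursor pagination described above can be modeled without Redis. The sketch below uses a sorted Python list in place of a Redis sorted set; the `TimelineCache` name, the 500-entry cap, and the score-as-cursor convention are assumptions for illustration.

```python
import bisect

class TimelineCache:
    """Timeline-head cache sketch: post IDs scored by timestamp, served
    with cursor-based pagination so scrolling resumes without offset drift."""

    HEAD_SIZE = 500  # keep only the most recent entries, per the hint above

    def __init__(self):
        self.entries = []  # ascending (score, post_id), like a Redis sorted set

    def add(self, post_id, score):
        bisect.insort(self.entries, (score, post_id))
        if len(self.entries) > self.HEAD_SIZE:
            self.entries.pop(0)  # evict the oldest entry

    def page(self, cursor=None, limit=20):
        """Return (post_ids, next_cursor); the cursor is the score of the
        last item served, so new inserts at the head never shift the page."""
        newest_first = (e for e in reversed(self.entries)
                        if cursor is None or e[0] < cursor)
        page = [next(newest_first, None) for _ in range(limit)]
        page = [e for e in page if e is not None]
        next_cursor = page[-1][0] if len(page) == limit else None
        return [post_id for _, post_id in page], next_cursor
```

With Redis this maps roughly to `ZADD` on write and `ZREVRANGEBYSCORE` with a `(cursor` exclusive bound on read; the cursor travels to the client and back, so the server stays stateless.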
3. Ranking and Content Filtering
Users expect a feed that surfaces interesting content, not just the most recent posts. Interviewers probe how you layer ranking on top of candidate generation without adding latency.
Hints to consider:
- Separate candidate generation (collect recent posts from followed accounts) from ranking (score and reorder candidates) into two stages
- Use lightweight features for ranking: post age, author engagement rate, media type, and whether the viewer has interacted with the author recently
- Apply hard filters first (blocked users, private posts without follow relationship, muted keywords) before scoring to reduce the candidate set
- Cache ranking model outputs for popular posts since their scores change slowly relative to request volume
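Putting the last three hints together, a feed ranker applies hard filters first and then scores the survivors with cheap features. The sketch below is illustrative: the feature weights, field names, and the `rank_candidates` function are assumptions, not tuned production values.

```python
import math
import time

def rank_candidates(candidates, viewer, now=None):
    """Two-stage sketch: hard filters first (blocks, private posts),
    then a lightweight score over the remaining candidates."""
    now = now if now is not None else time.time()

    # Stage 1: hard filters shrink the candidate set before any scoring.
    visible = [
        post for post in candidates
        if post["author"] not in viewer["blocked"]
        and (not post["private"] or post["author"] in viewer["following"])
    ]

    # Stage 2: lightweight scoring with illustrative weights.
    def score(post):
        age_hours = max((now - post["ts"]) / 3600, 0.01)
        recency = 1.0 / age_hours                  # decay with age
        engagement = math.log1p(post["likes"])     # dampen viral outliers
        affinity = 2.0 if post["author"] in viewer["recent_interactions"] else 0.0
        return recency + 0.5 * engagement + affinity

    return sorted(visible, key=score, reverse=True)
```

Keeping the filter stage ahead of scoring matters for latency: a blocked or private post should never reach the (comparatively expensive) model, and the reduced set keeps the scoring pass small.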
4. Privacy and Content Lifecycle Enforcement
Blocks, private accounts, deleted posts, and edited content must be reflected consistently across all feed views. Interviewers probe how you prevent stale or unauthorized content from appearing.
Hints to consider:
- Maintain a block/mute graph in a fast lookup store (Redis or in-memory) and filter at feed read time rather than trying to remove entries from precomputed timelines
- Propagate post deletions as tombstone events through the same fanout pipeline, with TTL-based cleanup of the tombstone records
- For private account transitions, invalidate cached timelines of non-followers rather than rewriting every historical feed entry
- Use idempotent pipelines so that replayed events (e.g., during recovery) do not resurface deleted content
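The tombstone and idempotency hints combine naturally into one component on the read path. This is a minimal sketch; the `TombstoneFilter` name, the event schema, and the absence of TTL cleanup are simplifications for illustration.

```python
class TombstoneFilter:
    """Sketch: deletions flow through the fanout pipeline as tombstone
    events; feed reads drop tombstoned IDs, and replayed events are
    no-ops so recovery cannot resurface deleted content."""

    def __init__(self):
        self.tombstones = {}      # post_id -> deletion ts (TTL cleanup omitted)
        self.seen_events = set()  # processed event ids, for idempotency

    def apply_event(self, event):
        if event["id"] in self.seen_events:
            return  # idempotent: a replayed event changes nothing
        self.seen_events.add(event["id"])
        if event["type"] == "delete":
            self.tombstones[event["post_id"]] = event["ts"]

    def filter_feed(self, post_ids):
        # Filtering at read time avoids rewriting millions of
        # precomputed timelines whenever a post is deleted.
        return [p for p in post_ids if p not in self.tombstones]
```

Because the filter runs at read time, a deletion becomes effective everywhere as soon as the tombstone lands, even for timelines that were fanned out long before.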
5. Handling Write Spikes and Backpressure
Major events (sports finals, breaking news, product launches) cause correlated posting spikes that can overwhelm the fanout and storage layers. Interviewers want to see resilience engineering.
Hints to consider:
- Buffer new posts in Kafka and process fanout asynchronously, allowing the publishing path to return success without waiting for full propagation
- Implement backpressure by monitoring consumer lag and temporarily throttling fanout for non-time-sensitive posts (e.g., old reshares) during spikes
- Use circuit breakers on downstream services (cache, database) so that a slow dependency does not cascade into publishing failures
- Autoscale fanout workers based on Kafka partition lag metrics, with pre-provisioned capacity for predictable events like holidays
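The lag-based throttling hint can be sketched as a small routing decision in front of the fanout workers. The `FanoutThrottle` class, the lag threshold, and the `time_sensitive` flag are hypothetical names; in practice the lag signal would come from Kafka consumer-group metrics.

```python
class FanoutThrottle:
    """Backpressure sketch: when consumer lag exceeds a threshold,
    park non-time-sensitive fanout work until the lag recovers."""

    LAG_THRESHOLD = 50_000  # illustrative: pending events before throttling

    def __init__(self):
        self.deferred = []

    def route(self, event, consumer_lag):
        """Decide whether an event is fanned out now or deferred."""
        if (consumer_lag > self.LAG_THRESHOLD
                and not event.get("time_sensitive", True)):
            self.deferred.append(event)  # e.g., old reshares wait out the spike
            return "deferred"
        return "fanout"

    def drain(self, consumer_lag):
        """Re-enqueue deferred events once lag has recovered."""
        if consumer_lag <= self.LAG_THRESHOLD:
            ready, self.deferred = self.deferred, []
            return ready
        return []
```

The key property is that fresh posts keep flowing during a spike while deferrable work absorbs the backpressure, which pairs with circuit breakers and lag-driven autoscaling rather than replacing them.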