Design a Social Media Feed Generation System
Problem Statement
Design the system behind a social media home feed — like Twitter, Instagram, or Facebook. When a user opens the app, they see a personalized timeline of posts from people they follow, ranked by relevance, with new content appearing in near real time.
The challenge is deciding how and when to assemble each user's feed. Do you pre-compute feeds when posts are published (fanout-on-write), or assemble them on demand when the user opens the app (fanout-on-read)? Each approach has trade-offs around latency, cost, and freshness — especially when a celebrity with 10 million followers posts.
Key Requirements
Functional
- Home feed -- users see a personalized, paginated feed of posts from accounts they follow
- Post publishing -- users create posts (text, images, video) that appear in their followers' feeds with low latency
- Follow/unfollow -- users manage their follow list, which shapes their feed content
- Ranking -- posts are ordered by relevance (recency, engagement, user preferences), not just chronologically
Non-Functional
- Scalability -- support hundreds of millions of users, with some accounts having millions of followers
- Latency -- the feed loads in under 300 ms; new posts from followed accounts appear within a few seconds
- Availability -- the feed must always load, even if some backend services are degraded
- Consistency -- eventual consistency is acceptable; a post appearing a few seconds late in some feeds is fine
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Fanout Strategy
The fundamental design decision: when a post is published, do you push it to every follower's feed (fanout-on-write), or do you pull from followed accounts when the user requests their feed (fanout-on-read)?
Hints to consider:
- Fanout-on-write: pre-compute each user's feed timeline by writing the post to every follower's inbox → fast reads, expensive writes
- Fanout-on-read: at read time, fetch recent posts from all followed accounts and merge → cheap writes, expensive reads
- Hybrid approach: use fanout-on-write for normal users, fanout-on-read for celebrity accounts (millions of followers)
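- Discuss the storage cost: fanout-on-write writes one timeline entry per follower, so a post from an author with N followers is stored N times (typically as a post ID reference rather than the full post body)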
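The two strategies above can be contrasted with a minimal in-memory sketch. Everything here is illustrative: `FeedStore`, its method names, and the dictionaries standing in for the follow graph, post store, and timeline inboxes are assumptions, not part of any real system. The point is only to show where the work lands: the push path does one inbox write per follower at publish time, while the pull path does a single write and a merge at read time.

```python
import heapq
from collections import defaultdict
from itertools import islice


class FeedStore:
    """Hypothetical in-memory stand-in for the follow graph, posts, and inboxes."""

    def __init__(self):
        self.followers = defaultdict(set)         # author -> follower ids
        self.following = defaultdict(set)         # user -> followed author ids
        self.posts_by_author = defaultdict(list)  # author -> [(ts, post_id)], oldest first
        self.inbox = defaultdict(list)            # user -> [(ts, post_id)] precomputed timeline

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def publish_push(self, author, post_id, ts):
        """Fanout-on-write: one inbox write per follower. Returns the write count."""
        self.posts_by_author[author].append((ts, post_id))
        for follower in self.followers[author]:
            self.inbox[follower].append((ts, post_id))
        return len(self.followers[author])

    def publish_pull(self, author, post_id, ts):
        """Fanout-on-read: a single write to the author's own post list."""
        self.posts_by_author[author].append((ts, post_id))
        return 1

    def read_push(self, user, limit=10):
        """Cheap read: the timeline is already assembled, just take the newest entries."""
        newest_first = sorted(self.inbox[user], reverse=True)
        return [post_id for _, post_id in newest_first[:limit]]

    def read_pull(self, user, limit=10):
        """Expensive read: merge every followed author's posts, newest first."""
        streams = [reversed(self.posts_by_author[a]) for a in self.following[user]]
        merged = heapq.merge(*streams, reverse=True)
        return [post_id for _, post_id in islice(merged, limit)]
```

In this sketch the cost of `publish_push` grows with the author's follower count, and the cost of `read_pull` grows with the number of accounts the reader follows, which is the trade-off an interviewer will expect you to articulate.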
2. Handling Celebrity Accounts (Hot Keys)
A celebrity posting triggers fanout to millions of timelines. Naive fanout-on-write will overwhelm your workers and storage.
Hints to consider:
- Mark accounts above a follower threshold as "celebrity" and exclude them from fanout-on-write
- When a user requests their feed, merge their pre-computed timeline with a live fetch of celebrity posts
- Use batching and rate limiting on the fanout pipeline to prevent worker overload
- Discuss how to detect and adapt to sudden follower count changes (account going viral)
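A rough sketch of the hybrid path described in these hints follows. The threshold value, the batch size, and the function names (`plan_fanout`, `hybrid_feed`) are assumptions chosen for illustration; a real system would tune the cutoff from measured fanout cost. The sketch skips fanout for celebrity authors, batches inbox writes for everyone else, and merges the precomputed inbox with live-fetched celebrity posts at read time.

```python
import heapq

CELEBRITY_THRESHOLD = 100_000  # assumed cutoff, not a recommended value


def is_celebrity(follower_count, threshold=CELEBRITY_THRESHOLD):
    return follower_count >= threshold


def chunked(items, size):
    """Yield fixed-size batches so each fanout job stays bounded."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_fanout(follower_ids, batch_size=1000, threshold=CELEBRITY_THRESHOLD):
    """Return batches of inbox writes, or [] for celebrity authors,
    whose posts are pulled at read time instead."""
    if is_celebrity(len(follower_ids), threshold):
        return []
    return list(chunked(follower_ids, batch_size))


def hybrid_feed(inbox, celebrity_streams, limit=20):
    """Merge a precomputed inbox with per-celebrity post lists fetched live.

    Every input is a list of (timestamp, post_id) sorted newest first.
    Duplicates are dropped, which covers an author who crossed the
    threshold recently and so appears in both sources.
    """
    merged = heapq.merge(inbox, *celebrity_streams, reverse=True)
    seen, feed = set(), []
    for _, post_id in merged:
        if post_id in seen:
            continue
        seen.add(post_id)
        feed.append(post_id)
        if len(feed) == limit:
            break
    return feed
```

The deduplication step is one simple way to tolerate an account going viral: posts fanned out before the account was reclassified remain in inboxes, and later posts arrive through the live fetch, without the reader seeing either twice.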
3. Feed Ranking and Personalization
A purely chronological feed is simple but not engaging. Users expect relevant content to be surfaced above posts that are merely recent.
Hints to consider:
- Score each candidate post using signals: recency, this user's past engagement with the author, post engagement (likes, comments), content type preference
- Run ranking as a lightweight model at read time over the candidate set (top ~500 posts from the last few days)
- Cache ranked feeds with short TTLs and invalidate on new activity
- Discuss the cold start problem: new users have no interaction history for personalization
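To make the scoring hint concrete, here is a hand-weighted scoring function, offered as a sketch rather than a production ranker. The `Candidate` fields, the 6-hour half-life, and the way signals are combined are all assumptions; real systems typically learn these weights. Recency decays exponentially, engagement is log-damped so viral posts do not dominate, and the viewer's affinity for the author and content type act as multipliers.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: str
    age_hours: float
    likes: int
    comments: int
    affinity: float            # 0..1, viewer's past engagement with the author
    content_type: str = "text"


def score(c, type_pref=None, half_life_hours=6.0):
    """Combine recency, post engagement, and viewer-specific signals."""
    recency = 0.5 ** (c.age_hours / half_life_hours)    # halves every half_life_hours
    engagement = math.log1p(c.likes + 2 * c.comments)   # comments weighted above likes (assumed)
    pref = (type_pref or {}).get(c.content_type, 0.0)   # per-viewer content type boost
    return recency * (1.0 + engagement) * (1.0 + c.affinity + pref)


def rank(candidates, type_pref=None, limit=20):
    """Return post ids ordered by descending score."""
    ordered = sorted(candidates, key=lambda c: score(c, type_pref), reverse=True)
    return [c.post_id for c in ordered[:limit]]
```

For a new user, `affinity` is 0 and `type_pref` is empty, so the score falls back to recency and global engagement. That is one reasonable cold-start default to mention, alongside alternatives such as onboarding interest selection.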