[ INFO ] category: System Design | difficulty: hard | freq: high | first seen: 2026-03-24
[HARD] [SYSTEM DESIGN] [HIGH] System Design, ML Systems, Recommendation, Content Ranking, machine_learning, Real-time Inference, backend, Content Health
$ cat problem.md
Design the machine learning system that powers Reddit's home feed — the primary surface where users discover and consume content as they scroll. The feed must rank and interleave posts from subscribed subreddits, trending content from communities the user has not yet joined, and advertisements, producing a single, personalized infinite-scroll experience.
Background
Reddit's home feed is the highest-traffic surface on the platform, serving hundreds of millions of feed requests per day. Unlike a purely chronological feed, the modern Reddit home feed uses ML ranking to determine the order in which posts appear. The system must balance multiple competing objectives: showing content the user is most likely to engage with (upvote, comment, share), maintaining a sense of freshness and recency, promoting content diversity across communities, and mixing in discovery content to help users find new subreddits.
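One common way to fold several engagement objectives into a single ranking value is a weighted sum of per-action predicted probabilities. The sketch below assumes a model that already emits a probability for each action; the function name, the action keys, and the weights are all hypothetical and are not taken from Reddit's actual system.

```python
from typing import Mapping

# Hypothetical weights: these only illustrate that "heavier" actions
# (comment, share) can count for more than a lightweight upvote.
DEFAULT_WEIGHTS = {"upvote": 1.0, "comment": 2.0, "share": 3.0}


def engagement_score(
    predictions: Mapping[str, float],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> float:
    """Combine per-action predicted probabilities into one scalar score.

    Actions missing from `predictions` contribute zero.
    """
    return sum(w * predictions.get(action, 0.0) for action, w in weights.items())
```

In practice the weights would be tuned through experiments rather than fixed by hand, and freshness or diversity terms would be layered on top of this base score.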
Key Components to Discuss
Candidate Sourcing — How do you assemble the pool of posts eligible for the feed (subscribed, trending, discovery, ads)?
Ranking Model — How do you score posts for a given user, and what objectives do you optimize?
Content Mixing & Blending — How do you interleave posts from different sources (subscribed vs. discovery vs. ads) to maximize user satisfaction?
Real-Time Signals — How do you incorporate rapidly changing signals like a post going viral or a user's in-session behavior?
Fairness & Content Health — How do you prevent engagement-bait, misinformation, and echo chambers from dominating the feed?
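As a starting point for the content mixing discussion, a simple baseline is slot-based interleaving: each source is ranked independently, then merged by fixed position rules. The sketch below is only an illustration under assumed parameters (`discovery_every`, `ad_every` are hypothetical knobs); a production blender would more likely score slots jointly or learn the cadence.

```python
from collections import deque
from typing import List, Sequence, Tuple


def blend_feed(
    subscribed: Sequence[str],
    discovery: Sequence[str],
    ads: Sequence[str],
    discovery_every: int = 4,
    ad_every: int = 8,
) -> List[Tuple[str, str]]:
    """Interleave three pre-ranked lists into (source, post_id) pairs.

    Positions are 1-indexed. Every `ad_every`-th slot takes an ad if one
    is left; every `discovery_every`-th slot prefers discovery content;
    all other slots prefer subscribed posts. Organic slots fall back to
    the other organic source when the preferred one is exhausted. The
    feed ends when organic content runs out, so it is never padded with ads.
    """
    organic = {"subscribed": deque(subscribed), "discovery": deque(discovery)}
    ad_queue = deque(ads)
    feed: List[Tuple[str, str]] = []
    position = 1
    while organic["subscribed"] or organic["discovery"]:
        if position % ad_every == 0 and ad_queue:
            feed.append(("ad", ad_queue.popleft()))
        else:
            if position % discovery_every == 0:
                first, second = "discovery", "subscribed"
            else:
                first, second = "subscribed", "discovery"
            source = first if organic[first] else second
            feed.append((source, organic[source].popleft()))
        position += 1
    return feed
```

Fixed slots make ad load and discovery rate easy to reason about and to A/B test, at the cost of ignoring how strong each individual candidate is.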
Discussion Points
How do you handle the tension between showing popular content (high engagement) and niche content from small subreddits the user cares about?
What is the right balance between chronological freshness and predicted relevance?
How would you detect and down-rank low-quality engagement bait (e.g., rage-bait, misleading titles)?
How do you evaluate whether the feed is healthy — not just engaging — over time?
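For the freshness versus relevance question, one simple framing is to multiply the model's predicted relevance by an exponential time decay, so the half-life becomes a single tunable knob. This is a sketch under that assumption; the function name and the 12-hour default are hypothetical, not values from the source.

```python
def decayed_score(
    predicted_relevance: float,
    age_hours: float,
    half_life_hours: float = 12.0,
) -> float:
    """Scale predicted relevance by an exponential freshness decay.

    The score halves every `half_life_hours`. A shorter half-life pushes
    the feed toward chronological order; a longer one lets relevance dominate.
    """
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    decay = 0.5 ** (max(age_hours, 0.0) / half_life_hours)
    return predicted_relevance * decay
```

With the default half-life, a day-old post with predicted relevance 0.9 scores 0.225 and ranks below a brand-new post with relevance 0.3, which shows how strongly the half-life choice shapes the feed.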