Design a Trending Hashtags System
Problem Statement
Design a system that delivers real-time commentary and statistics for live sporting events to millions of concurrent viewers across web and mobile platforms. The system must handle commentary updates from professional commentators, integrate with data feeds providing match statistics (scores, player actions, possession percentages), and deliver these updates to viewers with minimal delay. During peak moments like goals or critical plays, usage spikes dramatically as millions of users refresh simultaneously or receive push notifications.
Your system needs to support multiple ongoing matches across different sports (soccer, basketball, cricket), allow viewers to customize their experience (favorite team alerts, specific stat tracking), and maintain a continuous stream of updates even as commentators type rapidly during high-action sequences. The challenge lies in managing variable write patterns from commentators, coordinating multiple data sources, handling massive read fan-out during exciting moments, and ensuring viewers never miss critical updates even when network conditions fluctuate.
Key Requirements
Functional
- Live commentary delivery -- professional commentators must be able to post text updates, which appear to viewers within 1-2 seconds
- Statistics integration -- match statistics from official data feeds (scores, player stats, possession) must be merged with commentary in a unified timeline
- Match timeline -- viewers can scroll through the complete history of a match, seeing all commentary and events in chronological order
- Personalized alerts -- users can subscribe to specific teams or match types and receive push notifications for critical moments (goals, match start/end)
- Multi-match support -- the system must handle hundreds of simultaneous live matches during peak sporting weekends
Non-Functional
- Scalability -- support 10 million concurrent viewers across all active matches, with individual popular matches reaching 2-3 million viewers
- Reliability -- maintain 99.95% uptime during live matches, with graceful degradation rather than complete failure
- Latency -- deliver commentary updates to viewers within 2 seconds of commentator submission, and critical events within 1 second
- Consistency -- ensure commentary appears in the correct chronological order for all viewers, even during rapid-fire updates
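The scalability and latency targets above can be turned into a rough peak-load estimate. The sketch below is a back-of-envelope calculation only: the update rate (one update per second at peak) and the payload size (500 bytes) are assumptions chosen for illustration, not figures from the problem statement, and `peak_egress` is a hypothetical helper name.

```python
def peak_egress(viewers: int, updates_per_sec: float, payload_bytes: int) -> dict:
    """Estimate delivery volume for one match during a burst.

    All inputs other than the viewer count are assumptions to confirm
    with the interviewer.
    """
    deliveries = viewers * updates_per_sec   # messages leaving the system per second
    bits = deliveries * payload_bytes * 8    # raw payload bits, ignoring protocol overhead
    return {"deliveries_per_sec": deliveries, "gbit_per_sec": bits / 1e9}


# A 3-million-viewer match, assuming 1 update/s and 500-byte payloads.
estimate = peak_egress(3_000_000, 1, 500)
```

Under these assumptions a single popular match produces about 3 million deliveries per second and roughly 12 Gbit/s of payload, which is why the fan-out tier, not the write path, tends to dominate the design discussion.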
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Write Path and Commentary Ingestion
Commentators produce bursts of updates during exciting moments, creating spiky write patterns that differ dramatically from typical social media traffic. Interviewers want to see how you handle validation, ordering, and rapid propagation without overwhelming downstream systems.
Hints to consider:
- Consider how you'll buffer and batch commentary updates while maintaining perceived real-time delivery
- Think about how to guarantee ordering when multiple commentators contribute to the same match
- Explore strategies for handling commentator edits or deletions of recent updates
- Discuss how you'll integrate structured data feeds (scores, stats) with unstructured commentary text
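One way to reason about the ordering and merging hints is a per-match sequencer: every accepted event, whether commentary text or a structured stat, receives the next number in a single per-match sequence, and that number defines the timeline order. The following is a minimal in-memory sketch under that assumption. The names `TimelineEvent` and `MatchSequencer` and the `kind` labels are hypothetical, and a real system would back this with a durable, partitioned log rather than Python dictionaries.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class TimelineEvent:
    match_id: str
    seq: int        # per-match, strictly increasing; defines display order
    kind: str       # "commentary" or "stat" (illustrative labels)
    payload: dict


class MatchSequencer:
    """Assigns one ordered sequence per match, regardless of the source."""

    def __init__(self) -> None:
        self._counters: Dict[str, Iterator[int]] = {}
        self._timelines: Dict[str, List[TimelineEvent]] = {}

    def append(self, match_id: str, kind: str, payload: dict) -> TimelineEvent:
        # Each match gets its own counter, so matches never contend with each other.
        counter = self._counters.setdefault(match_id, itertools.count(1))
        event = TimelineEvent(match_id, next(counter), kind, payload)
        self._timelines.setdefault(match_id, []).append(event)
        return event

    def timeline(self, match_id: str) -> List[TimelineEvent]:
        # Return a copy so callers cannot reorder the stored history.
        return list(self._timelines.get(match_id, []))


sequencer = MatchSequencer()
first = sequencer.append("m1", "commentary", {"text": "Kick-off"})
second = sequencer.append("m1", "stat", {"score": "1-0"})
other = sequencer.append("m2", "commentary", {"text": "Tip-off"})
```

Because ordering comes from a single sequence per match, two commentators on the same match cannot produce conflicting orders. Edits and deletions could be modeled as new events that reference an earlier `seq`, though that is a design choice to discuss rather than something this sketch implements.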
2. Fan-Out Architecture for Millions of Concurrent Viewers
During a goal or game-winning play, millions of users simultaneously need the same update, creating a massive read amplification challenge. The interviewer wants to understand your fan-out strategy and how you'll avoid overwhelming individual database nodes.
Hints to consider:
- Evaluate push versus pull models for delivering updates to active viewers
- Consider tiered caching strategies that leverage CDNs, application caches, and client-side buffering
- Think about how WebSocket connections scale and when to switch between transport mechanisms
- Discuss partitioning strategies to distribute load while keeping match data co-located
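The push-model hints can be illustrated with a two-tier fan-out: the publisher sends one copy of an update to each gateway node that has viewers for the match, and each gateway multiplies it across its own connections. The sketch below is a simplified in-process model of that idea. `GatewayNode`, `FanOutBus`, and the `Deliver` callback (standing in for a WebSocket or SSE send) are hypothetical names, and the real tiers would be separate services connected by a message bus.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Deliver = Callable[[dict], None]  # stands in for a per-connection send


class GatewayNode:
    """Holds viewer connections and fans each update out locally."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Deliver]] = defaultdict(list)

    def subscribe(self, match_id: str, deliver: Deliver) -> None:
        self._subs[match_id].append(deliver)

    def has_viewers(self, match_id: str) -> bool:
        return bool(self._subs.get(match_id))

    def on_update(self, match_id: str, update: dict) -> int:
        viewers = self._subs.get(match_id, [])
        for deliver in viewers:
            deliver(update)
        return len(viewers)


class FanOutBus:
    """Sends one copy per interested gateway instead of one per viewer."""

    def __init__(self, gateways: Iterable[GatewayNode]) -> None:
        self._gateways = list(gateways)

    def publish(self, match_id: str, update: dict) -> int:
        copies = 0
        for gateway in self._gateways:
            if gateway.has_viewers(match_id):
                gateway.on_update(match_id, update)
                copies += 1
        return copies  # number of gateway-level sends, not viewer deliveries


inbox: List[dict] = []
busy, idle = GatewayNode(), GatewayNode()
for _ in range(3):
    busy.subscribe("m1", inbox.append)
copies_sent = FanOutBus([busy, idle]).publish("m1", {"seq": 7, "text": "GOAL!"})
```

Here the publisher performs one send while three viewers receive the update. At scale, the same shape keeps the write path's cost proportional to the number of gateways rather than the number of viewers.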
3. Handling Network Interruptions and Reconnections
Mobile viewers frequently experience network transitions (WiFi to cellular, signal loss, app backgrounding). The system must seamlessly catch them up without duplicate updates or gaps in coverage.
Hints to consider:
- Design a mechanism for clients to track their last received update and request deltas on reconnection
- Consider how to handle clients that were offline for extended periods versus brief disconnections
- Think about conflict-free data structures or sequence numbers that make catching up efficient
- Discuss rate-limiting strategies to prevent reconnection storms from overwhelming the system
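The reconnection hints can be made concrete with two small pieces: a bounded replay buffer keyed by sequence number, and a jittered backoff for reconnect attempts. This is a sketch under assumptions, not a prescribed design. It assumes each update carries a per-match, gap-free sequence number, and `ReplayBuffer`, `catch_up`, and `reconnect_delay` are hypothetical names with illustrative default values.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

Event = Tuple[int, dict]  # (sequence number, payload)


class ReplayBuffer:
    """Keeps the most recent events of one match for quick catch-up."""

    def __init__(self, capacity: int) -> None:
        self._events: Deque[Event] = deque(maxlen=capacity)

    def append(self, seq: int, payload: dict) -> None:
        self._events.append((seq, payload))

    def catch_up(self, last_seen_seq: int) -> Tuple[str, List[Event]]:
        if not self._events:
            return ("delta", [])
        oldest = self._events[0][0]
        if last_seen_seq < oldest - 1:
            # Events the client needs have been evicted: it was offline too
            # long and should reload from the durable timeline store instead.
            return ("snapshot", [])
        return ("delta", [e for e in self._events if e[0] > last_seen_seq])


def reconnect_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


buffer = ReplayBuffer(capacity=3)
for seq in range(1, 6):          # events 1..5; only 3, 4, 5 are retained
    buffer.append(seq, {"n": seq})

brief = buffer.catch_up(4)       # short disconnection
edge = buffer.catch_up(2)        # client has 1-2, buffer starts at 3: no gap
long_gap = buffer.catch_up(1)    # event 2 is gone: fall back to a snapshot
```

Clients that discard any event with a sequence number at or below their last seen value get duplicate suppression for free. The random delay spreads reconnect attempts over time so that a brief outage does not turn into a synchronized storm.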
4. Multi-Match Coordination and Resource Allocation
During major sporting events, dozens to hundreds of matches run simultaneously with wildly varying viewer counts. Resources must be allocated dynamically to avoid over-provisioning for quiet matches while ensuring popular ones don't degrade.
Hints to consider:
- Explore how to partition and shard data so each match can scale independently
- Consider auto-scaling triggers based on viewer count and update frequency
- Think about how to prioritize delivery for high-viewership matches during resource constraints
- Discuss strategies for pre-warming capacity before scheduled major events versus handling unexpected viral matches
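The auto-scaling and pre-warming hints can be expressed as a simple capacity rule: size each match's gateway pool for the larger of its current audience and its forecast audience, plus headroom. The sketch below assumes 50,000 connections per gateway and 30% headroom. Both numbers are illustrative assumptions, and `gateways_needed` and `plan` are hypothetical helper names.

```python
from typing import Dict, Tuple


def gateways_needed(
    viewers: int,
    conns_per_gateway: int = 50_000,   # assumed per-node connection limit
    headroom_pct: int = 30,            # assumed spare capacity for spikes
    min_gateways: int = 1,
) -> int:
    # Integer ceiling division avoids floating-point rounding surprises.
    demand = viewers * (100 + headroom_pct)
    capacity = conns_per_gateway * 100
    return max(min_gateways, -(-demand // capacity))


def plan(matches: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """matches maps match_id -> (current_viewers, forecast_viewers).

    Using the larger value pre-warms scheduled events before viewers arrive
    while still reacting to matches that outgrow their forecast.
    """
    return {
        match_id: gateways_needed(max(current, forecast))
        for match_id, (current, forecast) in matches.items()
    }


allocation = plan({
    "cup-final": (200_000, 3_000_000),   # scheduled major event, pre-warmed
    "friendly": (10_000, 0),             # quiet match, minimum footprint
})
```

With these assumptions the scheduled final is provisioned with 78 gateways before kick-off while the quiet match keeps a single node. An unexpectedly viral match would be picked up through its current viewer count on the next planning pass.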
Suggested Approach
Step 1: Clarify Requirements
Start by confirming the scale and usage patterns. Ask how many matches run simultaneously during peak periods (hundreds? thousands?), what the largest single match audience looks like (millions?), and whether commentary comes from a single official source per match or from multiple contributors. Clarify expectations around delivery latency: is 2 seconds acceptable, or is sub-second delivery expected? Understand whether viewers need the full match history on join or just recent updates. Confirm whether the system needs to support video replay clips and images, or only text and stats. Ask about internationalization: is commentary produced in multiple languages per match?