Practice/Meta/Design Top K Songs Widget for Spotify
Design Top K Songs Widget for Spotify
System DesignMust
Problem Statement
Design a global leaderboard system for a massively multiplayer online game with 50 million daily active users. Players compete in matches that last 5-30 minutes, and at the end of each match, their scores must be recorded and reflected in multiple leaderboards: global rankings, regional rankings, and friend-group rankings. The system should display top 100 players globally, top 50 in each region, and personalized rankings showing where a player stands relative to nearby competitors.
The core challenge is handling extreme write throughput during peak hours (up to 500,000 score updates per second) while maintaining sub-200ms read latency for leaderboard queries. The system must handle score ties fairly, prevent cheating through duplicate submissions, and gracefully degrade during regional outages without corrupting global state.
Key Requirements
Functional
- Score ingestion -- accept match results with player ID, score, timestamp, and match metadata from game servers
- Multiple leaderboard views -- support global, regional (10+ regions), and friend-group (up to 500 friends per player) rankings
- Personalized ranking context -- show each player their current rank plus 10 players above and below them
- Historical snapshots -- preserve daily and weekly leaderboard states for seasonal competitions and rewards distribution
Non-Functional
- Scalability -- handle 500,000 writes/second during peak hours and 2 million concurrent leaderboard reads
- Reliability -- ensure no score loss even during partial system failures; support exactly-once processing semantics
- Latency -- achieve sub-200ms p99 latency for leaderboard reads and process score updates within 5 seconds
- Consistency -- eventual consistency is acceptable for leaderboard updates; strong consistency required for reward distributions
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Write Path Architecture and Hot Key Management
Interviewers want to see how you handle massive concurrent writes to the same leaderboard entities, especially during synchronized events like daily resets or tournament finals when millions of players finish matches simultaneously.
Hints to consider:
- Consider batching score updates within short time windows to reduce database operations
- Explore sharding strategies that distribute writes for the same leaderboard across multiple nodes
- Discuss using write-behind caching or buffering to smooth out traffic spikes
- Address how to handle popular players whose scores update frequently without creating contention
2. Efficient Top-K Retrieval and Rank Calculation
The system must quickly retrieve top players and calculate arbitrary player positions without scanning entire datasets. Interviewers expect you to reason about data structures and indexing strategies for logarithmic-time operations.
Hints to consider:
- Evaluate sorted set data structures that support both score-based ordering and efficient rank lookups
- Consider pre-computing and materializing top-N results in fast storage layers
- Discuss tradeoffs between exact rankings versus approximate percentile-based positions
- Address how to efficiently compute "players near me" queries without full table scans
3. Handling Score Updates and Duplicate Prevention
Late-arriving events, retried requests, and malicious duplicate submissions can corrupt leaderboards. Interviewers expect robust idempotency and deduplication mechanisms.
Hints to consider:
- Design unique identifiers for matches that enable idempotent score processing
- Consider using distributed locks or conditional writes when updating player records
- Discuss time-window based deduplication strategies that balance memory usage with accuracy
- Address how to handle legitimate score corrections versus replay attacks
4. Multi-Tenancy and Leaderboard Segmentation
Supporting global, regional, and friend-group leaderboards simultaneously creates data modeling and consistency challenges. Interviewers want to see how you partition and aggregate data efficiently.
Hints to consider:
- Explore denormalized designs that maintain separate leaderboard copies versus normalized designs with dynamic aggregation
- Consider hierarchical rollup strategies where regional scores feed into global rankings
- Discuss caching strategies specific to each leaderboard type based on access patterns
- Address how friend-graph data affects leaderboard computation and storage
Suggested Approach
Step 1: Clarify Requirements
Start by confirming the scale parameters and business rules. Ask about the expected number of concurrent players, match completion rates, and whether the game has synchronized events (like hourly tournaments) that cause traffic spikes. Clarify whether partial player information (like "you're in top 5%") is acceptable for extremely large friend groups, or if exact rankings are required. Confirm the score update semantics: can scores decrease, or do they only increase? Verify whether there are different game modes with separate leaderboards, and if leaderboards reset on fixed schedules.