Microsoft's "Design Top K Songs Service" interview question asks candidates to build a scalable system that tracks and serves the top K most-played songs, either in real time or over specific time windows. It touches data engineering pipelines, caching, backend services, web APIs, machine learning for ranking, and stream processing.[1]
Design a service that processes song play events (e.g., from a music streaming app like Spotify) and efficiently returns the top K songs by play count. Start with an all-time top K under low traffic, then scale to high QPS (e.g., 1M requests/day), handle 10B plays/day across 100M songs, and support time-based queries such as the last 24 hours or the last X minutes, implemented with tumbling or sliding windows. K is typically 100-1000. Use stream processors (e.g., Kafka Streams, Flink), caching (e.g., Redis), and sharding for scalability.[2][1]
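The windowed requirement above can be sketched in a few lines. This is a minimal, single-process illustration, not the Kafka Streams or Flink implementation the question has in mind: it approximates a sliding window by summing fixed-size tumbling buckets. The class name `WindowedTopK`, its parameters, and the one-minute bucket size are assumptions made for this example, and events are assumed to arrive in timestamp order.

```python
import heapq
from collections import Counter, deque


class WindowedTopK:
    """Sliding-window top K built from tumbling buckets (hypothetical helper)."""

    def __init__(self, window_seconds: int, bucket_seconds: int = 60):
        self.bucket_seconds = bucket_seconds
        self.num_buckets = window_seconds // bucket_seconds
        self.buckets = deque()   # (bucket_id, Counter) pairs, oldest first
        self.totals = Counter()  # play counts summed over the live buckets

    def _evict(self, current_bucket: int) -> None:
        # Drop buckets that slid out of the window and subtract their counts.
        while self.buckets and self.buckets[0][0] <= current_bucket - self.num_buckets:
            _, old = self.buckets.popleft()
            self.totals.subtract(old)
            for song_id in old:
                if self.totals[song_id] <= 0:
                    del self.totals[song_id]

    def record_play(self, song_id: str, ts: float) -> None:
        # Assumption: timestamps are non-decreasing (no late events).
        bucket_id = int(ts // self.bucket_seconds)
        self._evict(bucket_id)
        if not self.buckets or self.buckets[-1][0] != bucket_id:
            self.buckets.append((bucket_id, Counter()))
        self.buckets[-1][1][song_id] += 1
        self.totals[song_id] += 1

    def top_k(self, k: int, now: float):
        self._evict(int(now // self.bucket_seconds))
        # A heap-based selection costs O(n log k) instead of a full sort.
        return heapq.nlargest(k, self.totals.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    w = WindowedTopK(window_seconds=120, bucket_seconds=60)
    for song_id, ts in [("a", 0), ("a", 10), ("b", 30), ("b", 70), ("b", 80), ("c", 130)]:
        w.record_play(song_id, ts)
    print(w.top_k(2, now=130))  # [('b', 2), ('c', 1)]
```

The window is only as precise as the bucket size: here the 120-second window covers the two most recent one-minute buckets, so the plays of "a" in the first minute have already been evicted. For scale, 10B plays/day averages roughly 115,700 events per second, so a real deployment would partition this state across shards rather than hold it in one process.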
The sources give no formal I/O formats; a query might look like GET /topk?k=10&window=24h&region=us. Typical API examples include:
Request: GET /api/topk?k=10&window=24h
Response:
[
  {"song_id": "song1", "title": "Hit Song A", "plays": 120000},
  {"song_id": "song2", "title": "Hit Song B", "plays": 115000},
  ...
]
Plays are aggregated over the requested window, and results are sorted by play count in descending order.[2][3]
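The request and response shape above can be illustrated with a small handler. This is a hedged sketch using only the Python standard library, not a real framework endpoint: the function name `handle_topk`, the default values, and the layout of the pre-aggregated `counts_by_window` dictionary are all assumptions made for this example.

```python
import json
from urllib.parse import urlparse, parse_qs


def handle_topk(url: str, counts_by_window: dict, titles: dict) -> str:
    """Build the JSON body for a top-K request (hypothetical handler)."""
    params = parse_qs(urlparse(url).query)
    k = int(params.get("k", ["10"])[0])        # assumed default: k=10
    window = params.get("window", ["24h"])[0]  # assumed default: 24h
    # Assumption: counts are already aggregated per window upstream.
    counts = counts_by_window.get(window, {})
    # Sort by plays descending; break ties by song_id for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return json.dumps(
        [{"song_id": sid, "title": titles.get(sid, ""), "plays": plays}
         for sid, plays in ranked]
    )
```

Calling `handle_topk("/api/topk?k=2&window=24h", counts, titles)` with counts for song1 (120000), song2 (115000), and song3 (90000) under the "24h" key would return the first two entries in the order shown in the response above. An unknown window yields an empty list in this sketch; a production API would more likely reject it with an error status.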