Design Spotify
Problem Statement
Design a music streaming platform that allows users to upload songs, create albums, and stream music content. Users browse a massive catalog, search for artists and tracks, and stream songs on demand across devices with low startup latency and uninterrupted playback.
Interviewers use this question to test how you handle large media blobs, read-heavy traffic at global scale, and multi-step ingestion pipelines (upload, transcode, index). They want to see clear separation of concerns (metadata vs media delivery), sensible storage choices, CDN usage for latency, and event-driven design for analytics and search. Expect to reason about tradeoffs between consistency, cost, and performance while keeping the core user flows simple and robust.
Key Requirements
Functional
- Song upload -- users upload songs with metadata (title, artist, artwork) and manage ownership
- Album management -- users create and manage albums by organizing uploaded tracks and publishing them
- Music streaming -- users stream songs with low startup latency and uninterrupted playback across devices
- Search and browse -- users search and browse the catalog by artist, album, track, and genre
Non-Functional
- Scalability -- support hundreds of millions of tracks and billions of daily streams across a global user base
- Reliability -- ensure uploaded content is never lost; streaming should degrade gracefully (lower quality) rather than fail
- Latency -- start playback within 2 seconds; return search results in under 200 ms
- Consistency -- eventual consistency for catalog updates; strong consistency for user library and playlist modifications
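A quick back-of-envelope calculation helps ground the scalability requirement. The sketch below is only illustrative: the 1 billion daily streams, 100 million tracks, 210-second average track length, and 96/160/320 kbps bitrate ladder are assumed figures chosen for the arithmetic, not numbers given in the problem.

```python
# Back-of-envelope sizing. Every input passed to these helpers is an
# assumption picked for illustration, not a figure from the problem statement.
SECONDS_PER_DAY = 86_400


def average_streams_per_second(daily_streams: int) -> float:
    """Average stream starts per second; peak load would be higher."""
    return daily_streams / SECONDS_PER_DAY


def track_size_mb(bitrate_kbps: int, duration_s: int) -> float:
    """Size of one rendition: kilobits -> kilobytes (/8) -> megabytes (/1000)."""
    return bitrate_kbps * duration_s / 8 / 1000


def catalog_storage_pb(num_tracks: int, ladder_kbps, duration_s: int) -> float:
    """Storage for every rendition of every track, in petabytes (1 PB = 1e9 MB)."""
    per_track_mb = sum(track_size_mb(b, duration_s) for b in ladder_kbps)
    return num_tracks * per_track_mb / 1e9


if __name__ == "__main__":
    print(average_streams_per_second(1_000_000_000))
    print(track_size_mb(160, 210))
    print(catalog_storage_pb(100_000_000, (96, 160, 320), 210))
```

Under these assumptions the transcoded catalog lands in the low-petabyte range and the average request rate in the tens of thousands per second, which is why object storage plus a CDN is the usual answer rather than serving audio from application servers.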
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Media Storage and Delivery
Audio files are large binary objects that need efficient storage and global delivery. Interviewers expect a clear blob handling strategy.
Hints to consider:
- Store audio files in object storage (S3) and deliver via CDN with edge caching for popular tracks
- Transcode uploaded audio into multiple bitrates and formats for adaptive bitrate streaming
- Use segmented streaming (HLS/DASH) so clients can start playback quickly and adjust quality based on network conditions
- Generate short-lived signed URLs (pre-signed URLs at the object store, or signed URLs or cookies at the CDN) so clients are authorized without routing media through application servers
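The idea behind signed media URLs can be sketched with a plain HMAC. This is a generic illustration, not the signing format of any particular CDN or of S3; the secret, the `expires` and `sig` parameter names, and the example path are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical shared secret known to the API servers and the edge.
SECRET = b"demo-secret"


def _signature(path: str, expires_at: int, secret: bytes) -> str:
    message = f"{path}:{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Issued by the API after it has checked the user's entitlement."""
    query = urlencode({"expires": expires_at, "sig": _signature(path, expires_at, secret)})
    return f"{path}?{query}"


def verify_url(url: str, now: int, secret: bytes = SECRET) -> bool:
    """Checked at the edge: no database lookup, only the shared secret."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False  # link has expired
    expected = _signature(parsed.path, expires_at, secret)
    return hmac.compare_digest(sig, expected)
```

For example, `sign_url("/tracks/123/160k/seg0.ts", expires_at)` yields a link that verifies until `expires_at` and fails if the path is altered. The design point to raise in an interview is that authorization happens once at the API, while the heavy media bytes flow directly from the edge.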
2. Ingestion Pipeline
Uploading and processing a track involves multiple steps that should not block the user. Interviewers expect an asynchronous, fault-tolerant pipeline.
Hints to consider:
- Use a durable workflow: upload to S3, trigger transcoding workers via a message queue, and update the catalog on completion
- Support resumable uploads for large files over unreliable networks
- Process artwork (resize, compress) and generate waveforms in parallel with audio transcoding
- Track ingestion progress with status updates (uploading, processing, available) visible to the uploader
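The status tracking described above can be sketched as a small state machine. The `uploading`, `processing`, and `available` states come from the hints; the `failed` state, the task names, and the `IngestionJob` class are assumptions added for illustration. Handlers are written to be idempotent because message queues typically deliver at least once.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    UPLOADING = "uploading"
    PROCESSING = "processing"
    AVAILABLE = "available"
    FAILED = "failed"  # assumption: not named in the hints


# Hypothetical task names for the parallel processing steps.
REQUIRED_TASKS = frozenset({"transcode", "artwork", "waveform"})


@dataclass
class IngestionJob:
    track_id: str
    status: Status = Status.UPLOADING
    done: set = field(default_factory=set)

    def upload_complete(self) -> None:
        """Called once the object store confirms the upload."""
        if self.status is Status.UPLOADING:
            self.status = Status.PROCESSING

    def task_complete(self, task: str) -> None:
        """Called by a worker; safe to call twice for the same task."""
        if task not in REQUIRED_TASKS:
            raise ValueError(f"unknown task: {task}")
        if self.status is not Status.PROCESSING:
            return  # ignore early, late, or duplicate messages
        self.done.add(task)
        if self.done == REQUIRED_TASKS:
            self.status = Status.AVAILABLE  # publish to the catalog here

    def task_failed(self, task: str) -> None:
        """Mark the job failed after retries are exhausted."""
        if self.status is Status.PROCESSING:
            self.status = Status.FAILED
```

In a real system this state would live in a database row or a workflow engine rather than in memory; the sketch only shows the transitions the uploader would see.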
3. Catalog Search and Discovery
Search is a critical user-facing feature. Interviewers probe how you make a large catalog discoverable with fast, relevant results.
Hints to consider:
- Use Elasticsearch with custom analyzers, synonym filters, fuzzy matching, and autocomplete for fast, typo-tolerant search
- Index denormalized catalog documents (track, artist, album) with popularity signals for relevance boosting
- Implement faceted search for filtering by genre, release year, duration, and mood
- Keep the search index updated via change data capture from the catalog database
4. Streaming Architecture and Playback Experience
Interviewers want to see how you ensure smooth playback across variable network conditions and device types.
Hints to consider:
- Implement adaptive bitrate streaming: client starts with a low-quality segment and upgrades as bandwidth is measured
- Pre-fetch the next track in a playlist to enable gapless playback transitions
- Cache recently and frequently played tracks at CDN edge locations; use analytics to predict and pre-warm cache for trending content
- Handle offline mode by allowing clients to download encrypted tracks for local playback with DRM enforcement
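Client-side adaptive bitrate selection can be sketched as a throughput estimate plus a rule for picking a rendition. This is a simplified throughput-based heuristic, not the algorithm of any specific player; the bitrate ladder, the smoothing factor `alpha`, and the `safety` margin are assumed values.

```python
# Assumed bitrate ladder in kbps; real ladders depend on codec and product tier.
LADDER_KBPS = (96, 160, 320)


def update_throughput(prev_estimate, sample_kbps, alpha=0.3):
    """Exponentially weighted moving average of measured segment throughput."""
    if prev_estimate is None:
        return sample_kbps
    return alpha * sample_kbps + (1 - alpha) * prev_estimate


def choose_bitrate(estimate_kbps, ladder=LADDER_KBPS, safety=0.8):
    """Pick the highest rendition that fits within a safety margin.

    With no measurement yet, start at the lowest bitrate so playback
    begins quickly, then step up as the estimate improves.
    """
    if estimate_kbps is None:
        return min(ladder)
    budget = estimate_kbps * safety
    affordable = [b for b in ladder if b <= budget]
    return max(affordable) if affordable else min(ladder)


def simulate(samples_kbps):
    """Return the bitrate chosen before each segment download."""
    estimate = None
    chosen = []
    for sample in samples_kbps:
        chosen.append(choose_bitrate(estimate))
        estimate = update_throughput(estimate, sample)
    return chosen
```

Calling `simulate` with a list of per-segment throughput samples shows the client starting at the lowest rendition and stepping up or down as the smoothed estimate changes. Production players usually also factor in buffer level to avoid oscillating between renditions.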