Design Netflix/Video Streaming Platform
Problem Statement
Design a video streaming platform like Netflix that supports video playback across multiple devices (mobile, web, smart TVs) with seamless resume functionality, subscription management, personalized recommendations, and content upload capabilities. Users expect to press play and see video within two seconds, switch between devices without losing their place, and browse a homepage tailored to their viewing history.
The system ingests raw video files from content partners, transcodes them into dozens of resolution and bitrate variants for adaptive streaming, and distributes segments through a global CDN. Behind the browsing experience sits a recommendation engine that ranks thousands of titles per user, updated frequently based on watch behavior. In MongoDB interviews specifically, interviewers have reportedly focused heavily on how large movie files are stored and processed rather than on the streaming delivery path alone, so be prepared to go deep on the upload, transcoding, and storage architecture in addition to the playback pipeline.
Key Requirements
Functional
- Video upload and processing -- content partners upload raw video files that are transcoded into multiple resolutions (480p through 4K) and packaged as HLS or DASH segments for adaptive bitrate streaming
- Playback with resume -- users stream video with adaptive quality switching and can pause on one device and resume from the exact timestamp on another
- Personalized homepage -- users see a curated set of recommended titles ranked by their viewing history, preferences, and trending signals
- Subscription gating -- playback access is controlled by subscription tier, with entitlement checks enforced at stream initiation
Non-Functional
- Scalability -- support hundreds of millions of subscribers with tens of millions of concurrent streams during peak evening hours
- Latency -- achieve video start times under 2 seconds and rebuffering rates below 1 percent through effective CDN caching and segment sizing
- Reliability -- maintain 99.9 percent streaming availability with redundant encoding pipelines and multi-region CDN failover
- Cost efficiency -- optimize storage through compression and tiered archival, and minimize bandwidth costs with cache-friendly content distribution
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Video Storage and Processing Pipeline
MongoDB interviewers have specifically emphasized how large movie files are stored, more so than the streaming delivery itself. You need to demonstrate a clear understanding of the full lifecycle from raw upload to playable segments.
Hints to consider:
- Accept uploads via resumable multipart upload to object storage (S3), with each upload tracked by a metadata record in a relational or document database
- Trigger an asynchronous transcoding pipeline through a message queue (Kafka or SQS) that fans out to GPU-accelerated encoding workers producing multiple renditions in parallel
- Store transcoded segments in object storage organized by title ID, quality level, and segment number for cache-friendly, content-addressable URLs
- Implement checkpointing within transcoding jobs so that a worker failure resumes from the last completed segment rather than restarting the entire file
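The checkpointing idea in the last hint can be sketched in a few lines. This is a minimal in-memory illustration with hypothetical names (`TranscodeJob`, `run`); a real pipeline would persist the checkpoint in a database or the job queue rather than in process memory.

```python
class TranscodeJob:
    """Transcodes a title into fixed-length segments, checkpointing
    after each segment so a restarted worker resumes mid-file."""

    def __init__(self, title_id: str, total_segments: int):
        self.title_id = title_id
        self.total_segments = total_segments
        self.completed = 0  # checkpoint: count of fully transcoded segments

    def run(self, fail_at=None) -> bool:
        """Process segments from the checkpoint onward. Returns True when
        the whole file is done; False simulates a worker crash."""
        for seg in range(self.completed, self.total_segments):
            if fail_at is not None and seg == fail_at:
                return False          # worker died; checkpoint survives
            # ... invoke the encoder for segment `seg` here ...
            self.completed = seg + 1  # persist checkpoint after each segment

        return True


job = TranscodeJob("tt0111161", total_segments=100)
job.run(fail_at=60)         # first worker crashes partway through
assert job.completed == 60  # resume point is segment 60, not zero
job.run()                   # replacement worker picks up at segment 60
assert job.completed == 100
```

The key property is that the retry cost after a failure is proportional to one segment, not to the whole multi-hour file.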
2. Content Delivery and Adaptive Bitrate Streaming
Serving petabytes of video content globally with minimal buffering requires a multi-tier distribution strategy. Interviewers expect you to go beyond "use a CDN" and explain cache hierarchies, segment sizing, and origin shielding.
Hints to consider:
- Design a three-tier distribution: edge CDN nodes serve hot content, regional mid-tier caches absorb misses and shield the origin, and origin servers pull from object storage on cold requests
- Use HLS or DASH with 2-6 second segment durations, balancing startup latency (shorter segments) against HTTP overhead and cache efficiency (longer segments)
- Pre-warm CDN caches for new high-profile releases by pushing segments to edge locations before the title becomes available
- Implement origin shielding so that thousands of edge nodes requesting the same new segment funnel through a single regional cache rather than all hitting the origin simultaneously
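The cache-friendly URL layout mentioned above can be made concrete with a deterministic key scheme. This is a sketch under assumed conventions (the `titles/` prefix and `.ts` extension are illustrative, not a standard): because every edge node derives the identical key for a given segment, cache hit rates stay high across the whole tier.

```python
def segment_key(title_id: str, quality: str, segment: int) -> str:
    """Deterministic object-storage key per (title, rendition, segment).
    Zero-padding keeps keys lexicographically ordered by segment number."""
    return f"titles/{title_id}/{quality}/seg{segment:05d}.ts"


# Every cache tier resolves the same request to the same key:
assert segment_key("tt0111161", "1080p", 42) == "titles/tt0111161/1080p/seg00042.ts"
```

Segment-level URLs like this are also what make pre-warming practical: pushing a new release to edge locations is just iterating the key space for each rendition.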
3. Watch Progress and Cross-Device Resume
Tracking playback position for every user across every title they watch creates a high-frequency write workload. Interviewers want to see how you handle this without overloading your database or creating hot partitions.
Hints to consider:
- Buffer playback position updates on the client and flush to the server every 10-30 seconds rather than on every frame, dramatically reducing write volume
- Store progress in a key-value or wide-column store (DynamoDB, Cassandra) keyed by (user_id, title_id) for fast point lookups with predictable latency
- Use conditional writes or upserts to ensure that concurrent updates from multiple devices converge on the most recent timestamp
- Cache the active viewing session's progress in Redis with a short TTL so device switches see the latest position without querying the persistent store
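The conditional-write convergence in the third hint can be sketched as last-write-wins keyed on the client event timestamp. A plain dict stands in for the key-value store here; in DynamoDB or Cassandra the same guard would be a conditional update or a timestamp-based write.

```python
# In-memory stand-in for the progress store, keyed by (user_id, title_id).
progress: dict = {}


def save_progress(user_id: str, title_id: str, position_s: int, event_ts: int) -> None:
    """Upsert only if this update is newer than the stored one, so
    concurrent flushes from multiple devices converge on the latest
    position even when they arrive out of order."""
    key = (user_id, title_id)
    current = progress.get(key)
    if current is None or event_ts > current["event_ts"]:
        progress[key] = {"position_s": position_s, "event_ts": event_ts}


save_progress("u1", "t1", position_s=120, event_ts=1000)  # phone flush
save_progress("u1", "t1", position_s=90, event_ts=900)    # stale TV flush arrives late
assert progress[("u1", "t1")]["position_s"] == 120        # newer write wins
```

Using the client-side event timestamp (rather than server arrival time) is what keeps a delayed flush from an idle device from overwriting fresher progress.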
4. Recommendation Engine Architecture
Serving personalized recommendations for millions of users requires separating expensive ML computation from the low-latency read path. Interviewers assess whether you understand offline versus online ranking tradeoffs.
Hints to consider:
- Run batch candidate generation offline (hourly or daily) using collaborative filtering or embedding-based models, storing per-user candidate lists in a fast-read store
- Apply a lightweight online re-ranking layer at request time that incorporates real-time signals (time of day, recently watched, trending titles) without running heavy ML inference
- Cache precomputed homepage rows in Redis or a similar store with TTLs, falling back gracefully to a generic trending list if the personalized cache misses
- Separate the recommendation pipeline from the serving path so a model training failure never impacts homepage availability
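The graceful-fallback pattern from the hints above fits in a few lines. This is a sketch with a dict standing in for Redis; the point is the control flow, not the storage: the serving path never calls into the ML pipeline, so a cache miss or a training failure degrades to a generic trending list instead of failing the request.

```python
# Precomputed per-user homepage rows (written offline by the batch
# pipeline); a dict stands in for Redis here.
homepage_cache = {"u1": ["Title A", "Title B", "Title C"]}

# Generic fallback row, refreshed independently of any per-user model.
TRENDING = ["Trending 1", "Trending 2", "Trending 3"]


def get_homepage(user_id: str) -> list:
    """Serve personalized rows on a cache hit; otherwise fall back to
    trending so homepage availability never depends on the ML pipeline."""
    return homepage_cache.get(user_id, TRENDING)


assert get_homepage("u1") == ["Title A", "Title B", "Title C"]
assert get_homepage("brand_new_user") == TRENDING
```

A lightweight online re-ranker would slot in between the cache read and the response, reordering the cached candidates with real-time signals without running model inference on the request path.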