Design Netflix/Video Streaming Platform
Problem Statement
You are designing a video streaming platform similar to Netflix that allows content creators to upload videos and millions of users to stream them on demand across a wide variety of devices -- smartphones, tablets, smart TVs, laptops, and gaming consoles. The platform must handle the entire lifecycle of a video: upload, transcoding into multiple resolutions and codecs, storage, content delivery via CDN, and adaptive playback that adjusts quality based on the viewer's network conditions.
A core challenge is the video processing pipeline. Raw uploads can be gigabytes in size and must be transcoded into dozens of renditions (combinations of resolution, bitrate, and codec) to support adaptive bitrate streaming protocols like HLS and DASH. This processing is CPU-intensive and must complete within a reasonable time window so content becomes available quickly. Meanwhile, the playback experience must be seamless: viewers expect instant start times, smooth quality transitions when bandwidth fluctuates, and the ability to resume playback on a different device from exactly where they left off.
The platform serves 200 million subscribers globally, with peak concurrent viewership reaching 50 million streams. The content library contains hundreds of thousands of titles, and the system must serve personalized recommendations to keep users engaged.
Key Requirements
Functional
- Video upload and processing -- Content creators upload raw video files which are transcoded into multiple resolutions (4K, 1080p, 720p, 480p) and packaged for adaptive bitrate streaming
- On-demand playback -- Users browse a catalog, select a title, and begin streaming immediately with adaptive quality adjustment based on available bandwidth
- Cross-device resume -- Users can pause on one device and resume from the exact same position on another device
- Personalized recommendations -- The homepage displays content tailored to each user's viewing history, preferences, and trending titles
Non-Functional
- Scalability -- Support 50M concurrent streams with a catalog of 500K+ titles across multiple regions
- Latency -- Video playback starts within 2 seconds of pressing play; quality switches happen without visible buffering
- Availability -- 99.99% uptime for the playback path; upload and processing pipelines can tolerate brief delays
- Durability -- Zero data loss for uploaded content and user watch history
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Video Transcoding Pipeline
Converting a raw upload into dozens of playback-ready renditions is the most compute-intensive part of the system. Interviewers want to see how you decompose this work, parallelize it across workers, and handle failures.
Hints to consider:
- Think about splitting a video into chunks (e.g., 10-second segments) that can be transcoded independently in parallel
- Consider how a workflow orchestrator tracks the state of each chunk across multiple resolution outputs
- Explore how to handle worker failures mid-transcode without losing progress on completed chunks
- Think about priority queues so newly uploaded content from high-profile creators gets processed first
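The chunk-and-fan-out idea above can be sketched as follows. This is a minimal, hypothetical orchestrator (names like `TranscodeJob` and `run_pipeline` are illustrative, not a real framework): it tracks per-(chunk, rendition) state so that a retry after a worker failure skips chunks that already finished, and it parallelizes the independent transcode tasks across a worker pool.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass, field

RENDITIONS = ["4k", "1080p", "720p", "480p"]

@dataclass
class TranscodeJob:
    video_id: str
    num_chunks: int
    # Orchestrator state: (chunk_index, rendition) -> "done"
    state: dict = field(default_factory=dict)

def transcode_chunk(video_id: str, chunk: int, rendition: str):
    # Placeholder for the real work, e.g. an ffmpeg invocation per
    # segment; each (chunk, rendition) pair is independent.
    return (chunk, rendition)

def run_pipeline(job: TranscodeJob, max_workers: int = 8) -> bool:
    # On retry, only schedule tasks that are not already done,
    # so progress on completed chunks is never lost.
    tasks = [(c, r) for c in range(job.num_chunks) for r in RENDITIONS
             if job.state.get((c, r)) != "done"]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(transcode_chunk, job.video_id, c, r): (c, r)
                   for c, r in tasks}
        for fut in as_completed(futures):
            job.state[futures[fut]] = "done"
    return all(job.state.get((c, r)) == "done"
               for c in range(job.num_chunks) for r in RENDITIONS)
```

In a real system the state table would live in a durable store (e.g. a workflow engine's database) rather than in memory, and priority would be enforced by which jobs are pulled from the queue first.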
2. Adaptive Bitrate Streaming and CDN Delivery
Delivering smooth playback across diverse network conditions and devices requires careful architecture of the content delivery layer.
Hints to consider:
- Understand how HLS/DASH manifests reference multiple quality levels and let the client player switch between them
- Think about CDN cache warming strategies for new or trending content to avoid origin overload
- Consider how to handle the long tail of the catalog where most titles are rarely watched and may not be cached at edge nodes
- Explore origin shielding to prevent cache stampedes when many edge nodes simultaneously request the same segment
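To make the manifest idea concrete, here is a small sketch that generates an HLS master playlist listing one variant stream per rendition; the client player measures throughput and switches between the variant URLs on its own. The rendition table and URL layout are assumptions for illustration, not a prescribed scheme.

```python
# (name, resolution, peak bandwidth in bits/sec) -- illustrative values
RENDITIONS = [
    ("480p",  "842x480",   1_400_000),
    ("720p",  "1280x720",  2_800_000),
    ("1080p", "1920x1080", 5_000_000),
    ("4k",    "3840x2160", 16_000_000),
]

def build_master_playlist(video_id: str) -> str:
    """Build an HLS master playlist: one EXT-X-STREAM-INF entry per
    rendition, each pointing at that rendition's media playlist."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for name, resolution, bandwidth in RENDITIONS:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(f"/{video_id}/{name}/playlist.m3u8")
    return "\n".join(lines) + "\n"
```

The manifest itself is tiny and cheap to cache; the heavy objects are the media segments it points at, which is where cache warming and origin shielding matter.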
3. Cross-Device State Synchronization
Users expect to pick up exactly where they left off, even when switching between devices. This requires managing per-user playback state at massive scale.
Hints to consider:
- Think about the write frequency -- persisting the position for every second of playback creates enormous write volume
- Consider batching position updates on the client and flushing periodically (e.g., every 30 seconds) to reduce backend load
- Explore using a fast key-value store for the hot path with async persistence to a durable database
- Think about conflict resolution when two devices report different positions for the same title
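The hot-path store and conflict-resolution hints can be illustrated with a minimal sketch. `PlaybackStateStore` here is a hypothetical in-memory stand-in for the fast key-value layer; it resolves conflicting reports from two devices with last-write-wins on the client-reported timestamp, so a stale flush from a backgrounded device cannot move the position backward.

```python
class PlaybackStateStore:
    """In-memory stand-in for the hot-path key-value store.
    Clients batch position updates and flush periodically (e.g. every
    30 seconds); async persistence to a durable store is omitted here."""

    def __init__(self):
        # (user_id, title_id) -> (position_sec, reported_at)
        self._state = {}

    def update(self, user_id: str, title_id: str,
               position_sec: int, reported_at: float) -> None:
        key = (user_id, title_id)
        current = self._state.get(key)
        # Last-write-wins: only accept updates newer than what we hold.
        if current is None or reported_at > current[1]:
            self._state[key] = (position_sec, reported_at)

    def resume_position(self, user_id: str, title_id: str) -> int:
        entry = self._state.get((user_id, title_id))
        return entry[0] if entry else 0
```

Timestamp-based last-write-wins is the simplest policy; a production system might also compare positions (taking the max within a short window) to tolerate client clock skew.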
4. Content Recommendation System
Recommendations drive engagement and retention. Interviewers want to see how you balance real-time signals with batch-computed models.
Hints to consider:
- Consider a hybrid approach combining offline collaborative filtering with real-time feature updates
- Think about how to pre-compute recommendation lists and cache them per user, refreshing periodically
- Explore how trending content and new releases get injected into recommendations without waiting for batch cycles
- Think about A/B testing infrastructure to measure recommendation quality
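The hybrid serving idea above can be sketched as a merge step at request time: a batch-computed per-user list (refreshed periodically and cached) combined with live trending titles injected into a fixed number of slots, with deduplication. Function and parameter names here are illustrative assumptions.

```python
def assemble_homepage(user_id: str,
                      precomputed: dict,
                      trending: list,
                      max_items: int = 10,
                      trending_slots: int = 2) -> list:
    """Merge the cached, batch-computed list for this user with
    real-time trending titles, without waiting for the next batch cycle."""
    personalized = precomputed.get(user_id, [])
    # Inject up to `trending_slots` trending titles the user's
    # personalized list does not already contain.
    fresh = [t for t in trending if t not in personalized][:trending_slots]
    return (fresh + personalized)[:max_items]
```

Keeping the merge at serve time means the expensive collaborative-filtering pass stays offline, while new releases and trending spikes still surface within minutes.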
Suggested Approach
Step 1: Clarify Requirements
Confirm the expected catalog size, concurrent stream count, and geographic distribution. Ask whether live streaming is in scope or only on-demand content. Clarify the target transcoding latency -- should a new upload be playable within minutes or hours? Confirm which adaptive streaming protocol to support (HLS, DASH, or both). Ask about DRM requirements and whether offline downloads are in scope.