Design Netflix/Video Streaming Platform
Problem Statement
You are designing a video streaming platform similar to Netflix that allows content creators to upload videos and millions of users to stream them on demand across a wide variety of devices -- smartphones, tablets, smart TVs, laptops, and gaming consoles. The platform must handle the entire lifecycle of a video: upload, transcoding into multiple resolutions and codecs, storage, content delivery via CDN, and adaptive playback that adjusts quality based on the viewer's network conditions.
A core challenge is the video processing pipeline. Raw uploads can be gigabytes in size and must be transcoded into dozens of renditions (combinations of resolution, bitrate, and codec) to support adaptive bitrate streaming protocols like HLS and DASH. This processing is CPU-intensive and must complete within a reasonable time window so content becomes available quickly. Meanwhile, the playback experience must be seamless: viewers expect instant start times, smooth quality transitions when bandwidth fluctuates, and the ability to resume playback on a different device from exactly where they left off.
The platform serves 200 million subscribers globally, with peak concurrent viewership reaching 50 million streams. The content library contains hundreds of thousands of titles, and the system must serve personalized recommendations to keep users engaged.
Key Requirements
Functional
- Video upload and processing -- Content creators upload raw video files which are transcoded into multiple resolutions (4K, 1080p, 720p, 480p) and packaged for adaptive bitrate streaming
- On-demand playback -- Users browse a catalog, select a title, and begin streaming immediately with adaptive quality adjustment based on available bandwidth
- Cross-device resume -- Users can pause on one device and resume from the exact same position on another device
- Personalized recommendations -- The homepage displays content tailored to each user's viewing history, preferences, and trending titles
Non-Functional
- Scalability -- Support 50M concurrent streams with a catalog of 500K+ titles across multiple regions
- Latency -- Video playback starts within 2 seconds of pressing play; quality switches happen without visible buffering
- Availability -- 99.99% uptime for the playback path; upload and processing pipelines can tolerate brief delays
- Durability -- Zero data loss for uploaded content and user watch history
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Video Transcoding Pipeline
Converting a raw upload into dozens of playback-ready renditions is the most compute-intensive part of the system. Interviewers want to see how you decompose this work, parallelize it across workers, and handle failures.
Hints to consider:
- Think about splitting a video into chunks (e.g., 10-second segments) that can be transcoded independently in parallel
- Consider how a workflow orchestrator tracks the state of each chunk across multiple resolution outputs
- Explore how to handle worker failures mid-transcode without losing progress on completed chunks
- Think about priority queues so newly uploaded content from high-profile creators gets processed first
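The chunk-and-fan-out idea above can be sketched as follows. This is a minimal, hypothetical orchestrator (names like `TranscodeJob` and `run_pipeline` are illustrative, not a real framework): it tracks per-(chunk, rendition) state so that a retry after a worker failure skips chunks that already finished, and it parallelizes the independent transcode tasks across a worker pool.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass, field

RENDITIONS = ["4k", "1080p", "720p", "480p"]

@dataclass
class TranscodeJob:
    video_id: str
    num_chunks: int
    # Orchestrator state: (chunk_index, rendition) -> "done"
    state: dict = field(default_factory=dict)

def transcode_chunk(video_id: str, chunk: int, rendition: str):
    # Placeholder for the real work, e.g. an ffmpeg invocation per
    # segment; each (chunk, rendition) pair is independent.
    return (chunk, rendition)

def run_pipeline(job: TranscodeJob, max_workers: int = 8) -> bool:
    # On retry, only schedule tasks that are not already done,
    # so progress on completed chunks is never lost.
    tasks = [(c, r) for c in range(job.num_chunks) for r in RENDITIONS
             if job.state.get((c, r)) != "done"]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(transcode_chunk, job.video_id, c, r): (c, r)
                   for c, r in tasks}
        for fut in as_completed(futures):
            job.state[futures[fut]] = "done"
    return all(job.state.get((c, r)) == "done"
               for c in range(job.num_chunks) for r in RENDITIONS)
```

In a real system the state table would live in a durable store (e.g. a workflow engine's database) rather than in memory, and priority would be enforced by which jobs are pulled from the queue first.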
2. Adaptive Bitrate Streaming and CDN Delivery
Delivering smooth playback across diverse network conditions and devices requires careful architecture of the content delivery layer.
Hints to consider:
- Understand how HLS/DASH manifests reference multiple quality levels and let the client player switch between them
- Think about CDN cache warming strategies for new or trending content to avoid origin overload
- Consider how to handle the long tail of the catalog where most titles are rarely watched and may not be cached at edge nodes
- Explore origin shielding to prevent cache stampedes when many edge nodes simultaneously request the same segment
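To make the manifest idea concrete, here is a small sketch that generates an HLS master playlist listing one variant stream per rendition; the client player measures throughput and switches between the variant URLs on its own. The rendition table and URL layout are assumptions for illustration, not a prescribed scheme.

```python
# (name, resolution, peak bandwidth in bits/sec) -- illustrative values
RENDITIONS = [
    ("480p",  "842x480",   1_400_000),
    ("720p",  "1280x720",  2_800_000),
    ("1080p", "1920x1080", 5_000_000),
    ("4k",    "3840x2160", 16_000_000),
]

def build_master_playlist(video_id: str) -> str:
    """Build an HLS master playlist: one EXT-X-STREAM-INF entry per
    rendition, each pointing at that rendition's media playlist."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for name, resolution, bandwidth in RENDITIONS:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(f"/{video_id}/{name}/playlist.m3u8")
    return "\n".join(lines) + "\n"
```

The manifest itself is tiny and cheap to cache; the heavy objects are the media segments it points at, which is where cache warming and origin shielding matter.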
3. Cross-Device State Synchronization
Users expect to pick up exactly where they left off, even when switching between devices. This requires managing per-user playback state at massive scale.
Hints to consider:
- Think about the write frequency -- persisting the position for every second of playback creates enormous write volume
- Consider batching position updates on the client and flushing periodically (e.g., every 30 seconds) to reduce backend load
- Explore using a fast key-value store for the hot path with async persistence to a durable database
- Think about conflict resolution when two devices report different positions for the same title
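The hot-path store and conflict-resolution hints can be illustrated with a minimal sketch. `PlaybackStateStore` here is a hypothetical in-memory stand-in for the fast key-value layer; it resolves conflicting reports from two devices with last-write-wins on the client-reported timestamp, so a stale flush from a backgrounded device cannot move the position backward.

```python
class PlaybackStateStore:
    """In-memory stand-in for the hot-path key-value store.
    Clients batch position updates and flush periodically (e.g. every
    30 seconds); async persistence to a durable store is omitted here."""

    def __init__(self):
        # (user_id, title_id) -> (position_sec, reported_at)
        self._state = {}

    def update(self, user_id: str, title_id: str,
               position_sec: int, reported_at: float) -> None:
        key = (user_id, title_id)
        current = self._state.get(key)
        # Last-write-wins: only accept updates newer than what we hold.
        if current is None or reported_at > current[1]:
            self._state[key] = (position_sec, reported_at)

    def resume_position(self, user_id: str, title_id: str) -> int:
        entry = self._state.get((user_id, title_id))
        return entry[0] if entry else 0
```

Timestamp-based last-write-wins is the simplest policy; a production system might also compare positions (taking the max within a short window) to tolerate client clock skew.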
4. Content Recommendation System
Recommendations drive engagement and retention. Interviewers want to see how you balance real-time signals with batch-computed models.
Hints to consider:
- Consider a hybrid approach combining offline collaborative filtering with real-time feature updates
- Think about how to pre-compute recommendation lists and cache them per user, refreshing periodically
- Explore how trending content and new releases get injected into recommendations without waiting for batch cycles
- Think about A/B testing infrastructure to measure recommendation quality
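The hybrid serving idea above can be sketched as a merge step at request time: a batch-computed per-user list (refreshed periodically and cached) combined with live trending titles injected into a fixed number of slots, with deduplication. Function and parameter names here are illustrative assumptions.

```python
def assemble_homepage(user_id: str,
                      precomputed: dict,
                      trending: list,
                      max_items: int = 10,
                      trending_slots: int = 2) -> list:
    """Merge the cached, batch-computed list for this user with
    real-time trending titles, without waiting for the next batch cycle."""
    personalized = precomputed.get(user_id, [])
    # Inject up to `trending_slots` trending titles the user's
    # personalized list does not already contain.
    fresh = [t for t in trending if t not in personalized][:trending_slots]
    return (fresh + personalized)[:max_items]
```

Keeping the merge at serve time means the expensive collaborative-filtering pass stays offline, while new releases and trending spikes still surface within minutes.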
Suggested Approach
Step 1: Clarify Requirements
Confirm the expected catalog size, concurrent stream count, and geographic distribution. Ask whether live streaming is in scope or only on-demand content. Clarify the target transcoding latency -- should a new upload be playable within minutes or hours? Confirm which adaptive streaming protocol to support (HLS, DASH, or both). Ask about DRM requirements and whether offline downloads are in scope.