Design YouTube
Problem Statement
Design a video streaming platform like YouTube where creators can upload and share videos, viewers can stream content on demand, and users receive personalized recommendations based on their viewing history. The platform must handle the complete lifecycle from video upload through processing to global delivery at scale.
YouTube is an on-demand video streaming platform with two distinct paths: the creator path (upload, processing, management) and the viewer path (low-latency playback at scale). Interviewers ask this to assess whether you can design for both paths while handling large blobs, asynchronous workflows, and viral read traffic. Expect to discuss uploads, transcoding pipelines (HLS/DASH), CDN strategy, metadata stores, and operational concerns like hotspots and backfills.
Key Requirements
Functional
- Video upload -- creators reliably upload large video files with pause and resume support for interrupted uploads and progress visibility
- On-demand playback -- viewers watch videos with smooth playback, adaptive quality based on bandwidth, and the ability to seek to any position
- Content management -- creators manage metadata (title, description, thumbnail), see processing status, and control visibility (public, unlisted, private)
- Sharing and discovery -- users share video links, browse recommendations, and search for content across the platform
Non-Functional
- Scalability -- support billions of video views per day with millions of uploads, handling viral content that generates massive concurrent viewership
- Reliability -- maintain 99.9% uptime for playback with graceful degradation during partial outages; zero data loss for uploaded content
- Latency -- video playback start under 2 seconds, adaptive bitrate switching within one segment, upload processing completion within minutes
- Consistency -- eventual consistency acceptable for view counts and recommendations; strong consistency for upload state and access controls
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Upload and Transcoding Pipeline
Video files are massive binary objects requiring specialized handling. The upload path must be resilient to network interruptions, and the processing pipeline must transcode videos into multiple formats and resolutions asynchronously.
Hints to consider:
- Use pre-signed URLs for direct-to-object-storage uploads, bypassing application servers to avoid bandwidth bottlenecks
- Implement multipart/chunked uploads with resumability so users can recover from network failures without re-uploading
- Design the transcoding pipeline as an asynchronous workflow with retries, idempotency, and progress tracking across multiple stages (decode, encode at multiple resolutions, package into HLS/DASH segments, generate thumbnails)
- Use a message queue to decouple upload completion from transcoding, enabling backpressure and independent scaling
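The resumable-upload bookkeeping behind these hints can be sketched in a few lines. This is an illustrative model, not a real storage client: the `ResumableUpload` class, part numbering, and MD5-as-ETag convention are assumptions standing in for a multipart upload API (real systems use 5-100 MB parts and server-issued ETags).

```python
import hashlib

CHUNK_SIZE = 4  # bytes; tiny for the sketch -- real multipart uploads use 5-100 MB parts

def split_into_parts(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split an upload into numbered parts, as a multipart upload would."""
    return {
        i + 1: data[off:off + chunk_size]
        for i, off in enumerate(range(0, len(data), chunk_size))
    }

class ResumableUpload:
    """Tracks which parts the server has acknowledged so a client can resume."""
    def __init__(self, parts):
        self.parts = parts
        self.completed = {}          # part number -> etag (content hash)

    def upload_part(self, n):
        if n in self.completed:      # idempotent: re-sending a finished part is a no-op
            return self.completed[n]
        etag = hashlib.md5(self.parts[n]).hexdigest()
        self.completed[n] = etag
        return etag

    def remaining(self):
        return sorted(set(self.parts) - set(self.completed))

# Simulate an interrupted upload: parts 1-2 succeed, then the network drops.
upload = ResumableUpload(split_into_parts(b"0123456789abcdef"))
upload.upload_part(1)
upload.upload_part(2)
# On resume, the client asks which parts are missing and sends only those.
for n in upload.remaining():
    upload.upload_part(n)
assert upload.remaining() == []
```

The key design point is that resume state lives server-side (the set of acknowledged parts), so a client that crashes can reconstruct where it left off with one query instead of re-uploading gigabytes.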
2. CDN Strategy and Video Delivery
Serving video segments to millions of concurrent viewers requires intelligent content distribution. Cache misses during viral moments can overwhelm origin servers.
Hints to consider:
- Use origin shielding to protect storage from direct viewer requests, with dedicated shield POPs that aggregate CDN traffic
- Design cache keys that include video ID, resolution, and segment number for efficient parallel delivery of multiple bitrates
- Implement adaptive bitrate streaming (HLS or DASH) so clients automatically switch quality based on network conditions
- Handle thundering herd problems when a viral video generates millions of simultaneous requests for the same segments
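A common answer to the thundering-herd hint is request coalescing ("singleflight"): when many viewers miss cache on the same segment at once, one request goes to origin and the rest wait for its result. A minimal sketch, with the cache-key scheme from the hints above; the `Singleflight` class and `fetch_from_origin` are illustrative names, not a real CDN API.

```python
import threading

def cache_key(video_id: str, resolution: str, segment: int) -> str:
    # Key scheme from the hints: video ID + resolution + segment number.
    return f"{video_id}/{resolution}/seg-{segment:05d}"

class Singleflight:
    """Collapse concurrent cache misses for the same key into one origin fetch."""
    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}   # key -> Event guarding an in-progress fetch
        self._results = {}

    def get(self, key, fetch):
        with self._lock:
            if key in self._results:
                return self._results[key]      # cache hit
            ev = self._inflight.get(key)
            if ev is None:                     # we are the leader for this key
                ev = threading.Event()
                self._inflight[key] = ev
                leader = True
            else:
                leader = False
        if leader:
            self._results[key] = fetch(key)    # exactly one origin request
            ev.set()
        else:
            ev.wait()                          # followers wait instead of hitting origin
        return self._results[key]

origin_calls = []
def fetch_from_origin(key):
    origin_calls.append(key)
    return f"bytes-for-{key}"

sf = Singleflight()
key = cache_key("dQw4w9WgXcQ", "720p", 42)
threads = [threading.Thread(target=sf.get, args=(key, fetch_from_origin))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len(origin_calls) == 1  # eight concurrent viewers, one origin request
```

Real CDNs implement this inside the edge/shield POPs, but being able to sketch the mechanism is what interviewers look for when they probe viral-traffic handling.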
3. Metadata Storage and Hot Content Handling
Video metadata, view counts, and engagement data face extreme read/write patterns, especially for trending content that creates hot keys in storage.
Hints to consider:
- Separate the metadata store (video details, creator info) from engagement counters (views, likes) since they have very different access patterns
- Use sharded counters for view counts on popular videos to avoid single-key contention, with periodic aggregation
- Cache hot video metadata in Redis with short TTLs to absorb read spikes without overwhelming the primary database
- Design the recommendation system to precompute candidate sets offline and do lightweight blending at serve time
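The sharded-counter hint can be sketched directly. Shard count and the random-shard write policy are illustrative choices; in practice each shard is a separate row or key in the datastore, and a periodic job (or the read path) sums them.

```python
import random

NUM_SHARDS = 16  # pick more shards for hotter keys; 16 is an illustrative choice

class ShardedCounter:
    """Spread writes for one hot key (a viral video's view count) across shards."""
    def __init__(self, shards: int = NUM_SHARDS):
        self.shards = [0] * shards   # in production: N separate rows/keys

    def increment(self):
        # Each write lands on a random shard, so no single row is contended.
        self.shards[random.randrange(len(self.shards))] += 1

    def total(self) -> int:
        # The periodic aggregation job sums the shards into the displayed count.
        return sum(self.shards)

views = ShardedCounter()
for _ in range(10_000):
    views.increment()
assert views.total() == 10_000
```

The trade-off to mention in the interview: reads now cost N lookups (or read a slightly stale aggregated value), which is acceptable because view counts only need eventual consistency per the non-functional requirements above.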
4. Content Processing Workflow Orchestration
Transcoding, thumbnail generation, content moderation, and metadata extraction are CPU-intensive multi-minute workflows that must be orchestrated reliably.
Hints to consider:
- Model the processing pipeline as a directed acyclic graph where each stage produces outputs consumed by downstream stages
- Implement checkpointing so partially completed workflows can resume after worker failures without restarting
- Use dead-letter queues for videos that repeatedly fail processing, with alerting for manual investigation
- Design for priority levels so that popular creators or time-sensitive content is processed ahead of the standard queue
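The DAG-plus-checkpointing hints above can be combined into a small scheduler sketch. The stage names, `MAX_ATTEMPTS`, and the `run_pipeline` function are assumptions for illustration; real systems persist the checkpoint set and dead-letter queue in durable storage rather than in-memory Python objects.

```python
# Pipeline stages as a DAG: each stage lists the stages it depends on.
DAG = {
    "decode":       [],
    "encode_720p":  ["decode"],
    "encode_1080p": ["decode"],
    "thumbnails":   ["decode"],
    "package_hls":  ["encode_720p", "encode_1080p"],
}

MAX_ATTEMPTS = 3

def run_pipeline(video_id, stage_fn, checkpoint, dead_letter):
    """Run stages whose dependencies are done; skip checkpointed stages on resume."""
    progress = True
    while progress:
        progress = False
        for stage, deps in DAG.items():
            if stage in checkpoint or not all(d in checkpoint for d in deps):
                continue
            for attempt in range(1, MAX_ATTEMPTS + 1):
                try:
                    stage_fn(video_id, stage)
                    checkpoint.add(stage)   # persisted, so a restarted worker resumes here
                    progress = True
                    break
                except RuntimeError:
                    if attempt == MAX_ATTEMPTS:
                        dead_letter.append((video_id, stage))  # alert for manual triage
                        return checkpoint
    return checkpoint

# Resume after a crash: "decode" is already checkpointed, so it is not redone.
done = {"decode"}
calls = []
run_pipeline("v1", lambda vid, stage: calls.append(stage), done, dead_letter=[])
assert "decode" not in calls       # completed work is skipped on resume
assert done == set(DAG)            # all remaining stages finished
```

This also shows why idempotency matters: a stage may run, succeed, and crash before the checkpoint write, so every stage must tolerate being re-executed.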
Suggested Approach
Step 1: Clarify Requirements
Start by confirming scope with the interviewer. Ask about target scale (uploads per day, concurrent viewers), whether live streaming is in scope or just on-demand, maximum video length and file size, and supported devices. Clarify whether the recommendation engine is in scope or treated as a black box. Confirm whether content moderation is required before publishing. Establish SLAs for upload processing time and playback start latency.