Design YouTube
Problem Statement
Design a platform that delivers live sports events to millions of concurrent viewers with minimal delay, while also providing on-demand replays, real-time score updates, and synchronized commentary. The system must handle sudden traffic spikes when major games start, maintain consistent quality across varying network conditions, and support features like live chat and instant highlights.
Unlike on-demand video platforms, live streaming introduces unique challenges: content must be ingested in real-time from stadiums or broadcast centers, encoded and distributed with sub-second latency targets, and served to millions of viewers simultaneously watching the same event. The system must gracefully handle broadcaster interruptions, adapt to viewer bandwidth constraints, and provide frame-accurate DVR functionality. Interviewers want to see how you balance latency, scale, and cost while designing for both the live path (ingest to viewer) and the on-demand path (replay storage and delivery).
Key Requirements
Functional
- Live event streaming -- ingest video feeds from multiple sources, transcode in real-time, and deliver to millions of concurrent viewers with minimal delay
- Adaptive bitrate playback -- automatically adjust video quality based on viewer bandwidth and device capabilities without interrupting the stream
- DVR functionality -- allow viewers to pause, rewind, and catch up during live events while maintaining sync with the live broadcast
- Real-time metadata -- display synchronized scores, statistics, and commentary overlays that update as events unfold
- Instant replay generation -- automatically detect and create shareable highlight clips from live streams within seconds of key moments
Non-Functional
- Scalability -- support 10M+ concurrent viewers for major events with the ability to scale up within minutes of game start
- Reliability -- maintain 99.9% uptime during scheduled events with automatic failover for encoder and origin failures
- Latency -- deliver live video within 3-5 seconds of real-time (glass-to-glass latency) while maintaining quality
- Consistency -- ensure all viewers see the same content within acceptable time windows, with synchronized metadata updates
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Real-Time Ingest and Transcoding Pipeline
The live path is fundamentally different from on-demand uploads. You must ingest a continuous stream, transcode it in real-time into multiple bitrates and formats, and package segments as they arrive -- all while maintaining sub-second processing delays.
Hints to consider:
- Use protocols like RTMP or SRT for reliable ingest from broadcast centers, with primary and backup ingestion endpoints
- Deploy stateful transcoding clusters that maintain encoding context across segments, with hot standbys ready to take over
- Generate short segment durations (2-3 seconds) for low latency, but discuss the tradeoff with compression efficiency and CDN caching
- Implement health checks that detect stalled or corrupted streams and automatically switch to backup feeds
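The stalled-stream detection in the last hint can be sketched as a small monitor: it records the arrival time of the newest segment on each ingest endpoint and fails over to the backup feed when the primary goes quiet for longer than a couple of segment durations. This is an illustrative sketch (the class and threshold are assumptions, not a fixed design):

```python
# Illustrative health-check/failover sketch: feeds are ordered by priority,
# and a feed is considered stalled when no segment has arrived within the
# threshold (~2 segment durations at 2-second segments).

STALL_THRESHOLD_S = 4.0

class FeedMonitor:
    def __init__(self, feeds):
        self.feeds = feeds                            # e.g. ["primary", "backup"]
        self.last_segment = {f: 0.0 for f in feeds}   # feed -> last arrival time
        self.active = feeds[0]

    def on_segment(self, feed, now):
        """Called whenever a segment arrives on any ingest endpoint."""
        self.last_segment[feed] = now

    def select_feed(self, now):
        """Return the highest-priority feed that is not stalled."""
        for feed in self.feeds:
            if now - self.last_segment[feed] < STALL_THRESHOLD_S:
                self.active = feed   # failover, or fail-back to primary
                return feed
        return self.active           # all feeds stalled: keep the last known feed
```

Keeping the check priority-ordered means the system automatically fails back to the primary feed once it recovers, rather than staying on the backup indefinitely.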
2. CDN Strategy and Cache Warming
Live content creates unique caching challenges because everyone watches the same segments simultaneously, but those segments are constantly being created. A cache miss during a critical game moment can overwhelm your origin.
Hints to consider:
- Use origin shielding to protect transcoders from direct viewer requests, with dedicated shield POPs that aggregate CDN traffic
- Implement predictive cache warming that pushes new segments to CDN edges before viewers request them
- Design cache keys that include stream ID and segment sequence number, allowing parallel delivery of multiple bitrates
- Handle thundering herd problems when millions request the same new segment by using request coalescing at the CDN level
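The request-coalescing idea in the last hint can be shown in a few lines: when many viewers ask for the same freshly-created segment, only the first request triggers an origin fetch and the rest await the same in-flight result. A minimal asyncio sketch, assuming an async `origin_fetch` callable (the names are illustrative):

```python
import asyncio

class Coalescer:
    """Collapse concurrent requests for the same segment key into one origin fetch."""

    def __init__(self, origin_fetch):
        self.origin_fetch = origin_fetch   # async fn: segment key -> segment bytes
        self.inflight = {}                 # key -> asyncio.Task

    async def get(self, key):
        task = self.inflight.get(key)
        if task is None:
            # First requester for this segment: start the origin fetch and
            # register it so concurrent requesters can share the result.
            task = asyncio.create_task(self.origin_fetch(key))
            self.inflight[key] = task
            task.add_done_callback(lambda _t: self.inflight.pop(key, None))
        return await task
```

Real CDNs implement this at the edge (often called request collapsing), but the core invariant is the same: at most one origin request per segment key is in flight at a time.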
3. Time-Shifting and DVR Storage
Allowing viewers to pause and rewind live streams requires maintaining a sliding window of recent segments while managing storage costs and ensuring seekability.
Hints to consider:
- Store recent segments (last 2-4 hours) in a distributed cache tier with TTL-based eviction, separate from long-term replay storage
- Maintain parallel timelines for each viewer's playback position, with server-side tracking or client-side offsets
- Use a manifest manipulation service that generates personalized playlists showing available DVR windows per viewer
- Implement catch-up acceleration that temporarily speeds up playback (e.g. 1.05x) when viewers move back toward the live edge
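The manifest manipulation service mentioned above can be sketched as a function that, given the newest live segment sequence number, emits an HLS-style media playlist exposing only the viewer's allowed DVR window. The playlist tags follow the HLS format; the windowing policy and URL template are assumptions for illustration:

```python
SEGMENT_DURATION_S = 2  # short segments for low latency, per the ingest hints

def dvr_playlist(latest_seq, dvr_window_segments, segment_url_template):
    """Build a live HLS media playlist covering the last N segments."""
    first_seq = max(0, latest_seq - dvr_window_segments + 1)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{SEGMENT_DURATION_S}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_seq}",
    ]
    for seq in range(first_seq, latest_seq + 1):
        lines.append(f"#EXTINF:{SEGMENT_DURATION_S}.0,")
        lines.append(segment_url_template.format(seq=seq))
    # No #EXT-X-ENDLIST tag: the stream is live, so players keep re-polling
    # the playlist and the window slides forward as new segments arrive.
    return "\n".join(lines)
```

Personalizing the `dvr_window_segments` value per viewer (or per entitlement tier) is what turns this into the "personalized playlists" the hint describes.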
4. Metadata Synchronization and Fan Engagement
Scores, stats, and chat messages must appear synchronized with the video stream despite variable viewer latency, network delays, and different playback positions.
Hints to consider:
- Embed presentation timestamps (PTS) in metadata events that reference specific video frames, allowing client-side alignment
- Use WebSocket or Server-Sent Events for pushing real-time updates, with fallback polling for clients behind restrictive networks
- Implement eventual consistency for non-critical metadata like viewer counts, but provide strong consistency for scores and game state
- Consider dedicated infrastructure for chat messages separate from video delivery to isolate scaling and failure domains
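The PTS-based alignment from the first hint can be sketched client-side: pushed metadata events carry the presentation timestamp of the frame they refer to, and the client buffers them in a min-heap, releasing each event only when its own playback position reaches that PTS. That way a viewer 20 seconds behind live sees the score change when *their* stream shows it. A hypothetical sketch:

```python
import heapq

class MetadataAligner:
    """Buffer pushed metadata events and release them in playback order."""

    def __init__(self):
        self._pending = []   # min-heap of (pts_seconds, event)

    def on_event(self, pts, event):
        """Called when the server pushes an event tagged with a video PTS."""
        heapq.heappush(self._pending, (pts, event))

    def due_events(self, playback_pts):
        """Pop all events whose PTS is at or before the current playback position."""
        due = []
        while self._pending and self._pending[0][0] <= playback_pts:
            due.append(heapq.heappop(self._pending)[1])
        return due
```

The heap also handles out-of-order delivery for free: an event that arrives late but references an earlier frame is still released in PTS order.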
5. Capacity Planning and Cost Optimization
Live events create predictable traffic patterns (sudden spikes at game start, gradual decline) but with massive peak-to-trough ratios that make static provisioning expensive.
Hints to consider:
- Pre-scale CDN and transcoding capacity based on event schedules, with automated ramp-up 15-30 minutes before scheduled start times
- Use spot instances or preemptible VMs for transcoding workloads, with reservation-based capacity for critical backup encoders
- Implement quality-based admission control that reduces bitrate options or redirects overflow traffic during extreme spikes
- Archive live streams to cheaper cold storage after 24-48 hours, with separate on-demand encoding for popular replays
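The schedule-driven pre-scaling from the first hint can be expressed as a target-capacity function: capacity ramps linearly from baseline to peak during the lead window before each scheduled start, holds at peak through the event, then drops back. The lead time and instance counts below are assumptions for illustration, not recommended values:

```python
from datetime import datetime, timedelta

RAMP_LEAD = timedelta(minutes=30)   # start scaling 30 min before kickoff
BASELINE = 10                       # always-on transcoder instances
PEAK = 200                          # instances provisioned for a major event

def target_capacity(now, events):
    """events: list of (start, end) datetimes; returns the desired instance count."""
    target = BASELINE
    for start, end in events:
        ramp_start = start - RAMP_LEAD
        if ramp_start <= now < start:
            # Linear ramp from baseline to peak during the lead window.
            frac = (now - ramp_start) / RAMP_LEAD
            target = max(target, int(BASELINE + frac * (PEAK - BASELINE)))
        elif start <= now <= end:
            target = max(target, PEAK)
    return target
```

An autoscaler evaluating this function every minute gives the "automated ramp-up before scheduled start times" behavior; reactive autoscaling on live metrics then handles any demand the schedule did not predict.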