Design a Content Delivery Network (CDN)
System Design · Must
Problem Statement
You need to design a content delivery network (CDN) capable of streaming high-definition video content to millions of concurrent users across multiple geographic regions. The system must efficiently deliver large video files (ranging from 500 MB to 5 GB) while minimizing latency, reducing origin server load, and maintaining high availability even during traffic spikes such as live event broadcasts or new content releases.
Your design should handle 10 million concurrent viewers with peak aggregate bandwidth of 500 Tbps, serve content with p95 latency under 100ms, and ensure graceful degradation when edge nodes fail. Consider how content is ingested, replicated across edge locations, cached intelligently, and delivered to end users through adaptive bitrate streaming. The interviewer expects you to reason about bandwidth costs, cache invalidation strategies, and techniques for optimizing the last-mile delivery.
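A quick back-of-envelope check grounds the rest of the design. The viewer and bandwidth figures below come straight from the problem statement; the per-server egress capacity is a hypothetical figure used only to get a rough fleet-size lower bound.

```python
# Capacity sketch from the stated targets.
concurrent_viewers = 10_000_000
peak_aggregate_bps = 500e12          # 500 Tbps

per_stream_bps = peak_aggregate_bps / concurrent_viewers
print(per_stream_bps / 1e6)          # -> 50.0 (Mbps per stream)

# Assume ~100 Gbps of egress per edge server (illustrative, not a spec)
# to get a lower bound on the edge fleet at peak:
edge_server_egress_bps = 100e9
min_edge_servers = peak_aggregate_bps / edge_server_egress_bps
print(int(min_edge_servers))         # -> 5000
```

50 Mbps per stream is plausible for a mix of 1080p and 4K viewers, which is a useful sanity check that the stated numbers are internally consistent.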
Key Requirements
Functional
- Content ingestion -- system must accept video uploads from content providers, process them into multiple quality tiers, and distribute to edge locations
- Geographic distribution -- content must be replicated across edge servers in multiple regions to serve users from nearby locations
- Adaptive streaming -- clients should receive video segments at appropriate bitrates based on their network conditions
- Cache management -- system must intelligently cache popular content while evicting stale or unpopular items to optimize storage utilization
Non-Functional
- Scalability -- support 10 million concurrent streams with aggregate bandwidth of 500 Tbps during peak hours
- Reliability -- maintain 99.99% availability with automatic failover when edge nodes become unavailable
- Latency -- deliver content with p95 latency under 100ms from request to first byte
- Consistency -- ensure cache invalidation completes within 60 seconds globally when content is updated or removed
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Edge Node Architecture and Cache Strategy
Your CDN's performance hinges on how efficiently edge nodes cache and serve content. Interviewers want to understand your cache eviction policy, how you handle cold starts, and strategies for pre-warming caches with anticipated popular content.
Hints to consider:
- Discuss tiered caching with memory-based hot cache and disk-based warm storage
- Consider LRU with frequency boosting for viral content, or predictive pre-fetching based on content popularity trends
- Explain how cache hit ratios directly impact origin load and overall system cost
- Address cache coherence when content is updated or taken down for policy violations
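The "LRU with frequency boosting" hint above can be sketched as an LRU cache that gives frequently hit items a second chance before eviction. This is a minimal illustration, not a production policy; the boost threshold and the halving decay are assumed heuristics.

```python
from collections import OrderedDict

class FrequencyBoostedLRU:
    """LRU cache where hot items survive an extra eviction pass.
    Sketch only: threshold and decay choices are illustrative."""

    def __init__(self, capacity, boost_threshold=3):
        self.capacity = capacity
        self.boost_threshold = boost_threshold
        self.data = OrderedDict()          # key -> (value, hit_count)

    def get(self, key):
        if key not in self.data:
            return None
        value, hits = self.data.pop(key)
        self.data[key] = (value, hits + 1)  # move to MRU end, bump count
        return value

    def put(self, key, value):
        if key in self.data:
            self.data.pop(key)
        while len(self.data) >= self.capacity:
            old_key, (old_val, hits) = self.data.popitem(last=False)
            if hits >= self.boost_threshold:
                # Hot item: decay its count and give it one more chance
                self.data[old_key] = (old_val, hits // 2)
            else:
                break                       # coldest item evicted
        self.data[key] = (value, 0)
```

The payoff in interview terms: a plain LRU evicts a viral video the moment a burst of long-tail requests arrives, while the frequency boost keeps it resident, which directly protects the cache hit ratio and hence origin load.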
2. Origin Shield and Request Routing
Preventing cache stampedes and efficiently routing requests to healthy, nearby edge nodes is critical. Interviewers expect you to design an intelligent routing layer that balances load while minimizing latency.
Hints to consider:
- Implement an origin shield tier between edge nodes and origin to collapse duplicate requests
- Use DNS-based geo-routing combined with anycast for coarse-grained direction to nearest edge clusters
- Discuss consistent hashing for distributing content across edge nodes within a region
- Consider real-time health checks and traffic shifting when nodes experience degraded performance or failures
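The consistent-hashing hint can be made concrete with a small hash ring using virtual nodes, so that adding or removing an edge server remaps only a fraction of the content keys. The node names and the 100-vnode replication factor below are illustrative assumptions.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps content keys to edge nodes within a region.
    Sketch: vnode count and hash choice are illustrative."""

    def __init__(self, nodes, vnodes=100):
        self.ring = []                    # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, content_key):
        h = self._hash(content_key)
        idx = bisect.bisect(self.keys, h) % len(self.keys)
        return self.ring[idx][1]

ring = ConsistentHashRing(["edge-1", "edge-2", "edge-3"])
print(ring.node_for("video/abc/seg_001.ts"))  # deterministic node choice
```

Virtual nodes smooth out load imbalance; with only one hash point per physical node, a single server can end up owning a disproportionate arc of the ring.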
3. Video Segmentation and Adaptive Bitrate Delivery
Streaming large video files requires chunking content into smaller segments and enabling clients to adapt quality dynamically. Interviewers look for understanding of HLS or DASH protocols and how segment size affects buffering and seek performance.
Hints to consider:
- Segment videos into 2-10 second chunks encoded at multiple bitrates (e.g., 360p, 720p, 1080p, 4K)
- Generate manifest files that clients use to request appropriate quality segments based on bandwidth estimation
- Discuss segment-duration tradeoffs: longer segments reduce request overhead, while shorter segments enable faster bitrate switching and seeking
- Address how you handle live streaming with lower latency constraints compared to video-on-demand
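The client side of adaptive bitrate delivery reduces to picking the highest rung of the ladder that fits inside the estimated throughput. The ladder bitrates below are typical values and the 0.8 safety factor is an assumed heuristic, not part of HLS or DASH themselves.

```python
# (label, bits per second) -- typical ladder values, illustrative only
BITRATE_LADDER = [
    ("360p", 1_000_000),
    ("720p", 3_000_000),
    ("1080p", 6_000_000),
    ("4k", 16_000_000),
]

def select_rendition(estimated_bandwidth_bps, safety_factor=0.8):
    """Pick the highest rendition whose bitrate fits within a fraction
    of the estimated throughput; fall back to the lowest rung."""
    budget = estimated_bandwidth_bps * safety_factor
    chosen = BITRATE_LADDER[0]
    for label, bps in BITRATE_LADDER:
        if bps <= budget:
            chosen = (label, bps)
    return chosen[0]

print(select_rendition(8_000_000))   # -> 1080p (6 Mbps fits in 6.4 Mbps budget)
```

The safety margin is what prevents oscillation: selecting a rung that exactly matches measured throughput causes the player to flap between qualities as the estimate fluctuates.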
4. Content Propagation and Consistency
Getting new or updated content from origin to all edge locations quickly while maintaining consistency is challenging at global scale. Interviewers want to see you balance propagation speed with network efficiency.
Hints to consider:
- Use a hierarchical propagation model with regional hubs that fan out to local edge nodes
- Implement content versioning with immutable object keys to avoid cache invalidation races
- Discuss pull-based lazy loading versus push-based eager replication based on predicted demand
- Address purge propagation for emergency content takedowns using a pub-sub notification system
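The versioning hint above works by deriving cache keys from the content bytes, so an update mints a new key rather than invalidating the old one. The key layout here is an assumption for illustration, not a standard.

```python
import hashlib

def versioned_key(content_id: str, payload: bytes) -> str:
    """Derive an immutable cache key from the content bytes, so edges
    can cache each version forever. Sketch: key layout is illustrative."""
    digest = hashlib.sha256(payload).hexdigest()[:16]
    return f"{content_id}/{digest}"

v1 = versioned_key("movie-42/seg_001", b"original segment bytes")
v2 = versioned_key("movie-42/seg_001", b"re-encoded segment bytes")
print(v1 != v2)  # -> True: the update gets a fresh key
```

With this scheme, "invalidation" becomes a manifest update pointing at the new keys; only emergency takedowns still need the pub-sub purge path, since old versioned keys would otherwise remain servable from cache.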
5. Cost Optimization and Bandwidth Management
Operating a CDN at this scale incurs massive bandwidth and storage costs. Interviewers expect you to discuss strategies for reducing expenses without sacrificing user experience.
Hints to consider:
- Negotiate peering agreements and use multi-CDN strategies to optimize bandwidth costs across providers
- Implement intelligent cache warming to reduce origin egress charges during content launches
- Discuss compression techniques and serving lower bitrates to users on metered connections
- Consider edge compute for on-the-fly transcoding versus pre-encoding all quality variants
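The pre-encoding-versus-edge-transcoding tradeoff responds well to a simple cost model in the interview. Every number below (prices, catalog size, demand split) is an illustrative assumption, not a real provider quote.

```python
# Hedged cost comparison: all figures are assumed for illustration.
catalog_gb = 2_000_000             # 2 PB of source video
variants = 4                        # 360p/720p/1080p/4k pre-encoded
storage_cost_per_gb_month = 0.02    # $/GB-month (assumed)
transcode_cost_per_gb = 0.05        # $/GB transcoded on demand (assumed)
cold_fraction = 0.8                 # share of catalog rarely watched

# Option A: pre-encode every variant of every title
pre_encode_all = catalog_gb * variants * storage_cost_per_gb_month

# Option B: pre-encode only the hot 20%, transcode cold titles on demand
hot_storage = catalog_gb * (1 - cold_fraction) * variants * storage_cost_per_gb_month
cold_demand_gb = catalog_gb * cold_fraction * 0.01   # assume 1%/month watched
hybrid = hot_storage + cold_demand_gb * transcode_cost_per_gb

print(f"pre-encode all: ${pre_encode_all:,.0f}/mo, hybrid: ${hybrid:,.0f}/mo")
```

Under these assumptions the hybrid approach is roughly 5x cheaper, which is why long-tail catalogs favor on-demand transcoding while hot content justifies pre-encoding every rung of the ladder.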