Design Dropbox
Problem Statement
Design a file storage and synchronization system like Dropbox that allows users to upload, download, and sync files across multiple devices with real-time updates. Users drop a file in one place and it appears everywhere, even if they go offline and come back later. The system must handle large file transfers reliably, keep metadata consistent across devices, and support sharing with permission controls.
Interviewers ask this because it forces you to separate a control plane (metadata, auth, sync) from a data plane (large file transfers), handle multi-device real-time updates, and design for large blobs, resumable transfers, conflict resolution, and cost-efficient distribution through CDNs. Strong answers demonstrate clear requirements, a scalable architecture, and practical tradeoffs for reliability, consistency, and user experience.
Key Requirements
Functional
- Reliable file upload -- users upload files of any size with progress indication and the ability to pause, resume, and recover from interrupted transfers
- File browsing and download -- users browse, search, and download their files quickly across devices with accurate, up-to-date listings
- Multi-device sync -- files automatically synchronize across all devices, reflecting creates, updates, renames, and deletes, even after being offline
- Sharing and permissions -- users share files or folders with others and control access (view or edit) for collaborators and public links
Non-Functional
- Scalability -- support hundreds of millions of users with billions of files; handle upload bursts during business hours
- Reliability -- no file data loss even during network interruptions or server failures; guarantee eventual delivery of all sync events
- Latency -- file listings load within 300ms; sync notifications propagate to connected devices within 2 seconds
- Consistency -- strong consistency for metadata operations (moves, renames, permission changes); eventual consistency acceptable for sync propagation
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Large File Upload Architecture
Files are frequently multi-GB and uploaded over unreliable networks. Interviewers want to see a clear separation between the control plane and data plane, with the application server never touching file bytes.
Hints to consider:
- Use pre-signed URLs to upload file chunks directly to object storage (S3), keeping app servers out of the data path
- Support multipart uploads with content-addressable chunks (hash-based deduplication) so identical blocks across files are stored once
- Implement resumable uploads where each chunk is individually acknowledged, enabling recovery from any interruption point
- Trigger post-upload processing (virus scanning, thumbnail generation) via events after the final chunk is confirmed
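The chunking and dedup ideas above can be sketched in a few lines. This is a minimal illustration, not Dropbox's actual protocol: `chunk_file` splits a byte stream into fixed-size blocks keyed by content hash (identical blocks collapse into one stored object), and `plan_upload` shows how a resumable upload only sends chunks the server has not yet acknowledged. The 4 MiB chunk size and both function names are illustrative choices.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB per chunk (illustrative choice)

def chunk_file(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split a byte stream into fixed-size, content-addressed chunks.

    Returns the ordered list of chunk hashes (the file "recipe") plus a
    hash -> bytes map; identical blocks collapse into a single entry.
    """
    chunks = {}
    order = []
    for i in range(0, len(data), chunk_size):
        block = data[i:i + chunk_size]
        digest = hashlib.sha256(block).hexdigest()
        chunks[digest] = block  # dedup: same content, same key
        order.append(digest)
    return order, chunks

def plan_upload(order, already_stored: set):
    """Resumable, deduplicated upload plan: send only the chunks the
    server has not already acknowledged (by content hash)."""
    return [h for h in dict.fromkeys(order) if h not in already_stored]
```

In a real system each hash in the upload plan would map to a pre-signed URL for a multipart-upload part, so file bytes flow directly from the client to object storage; after an interruption the client re-runs `plan_upload` against the server's acknowledged set and resumes from there.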
2. Multi-Device Sync Protocol
Users expect files to appear on all their devices within seconds. Interviewers probe your sync architecture, especially for offline scenarios and conflict resolution.
Hints to consider:
- Maintain a durable change log (event stream) where every file operation is recorded with a monotonically increasing sequence number
- Each device tracks a per-device cursor (last processed sequence number) and requests changes since that cursor on reconnect
- Use a single persistent WebSocket connection per client for push notifications of new changes, with delta-sync on reconnection
- Handle conflicts (two devices modify the same file offline) with a deterministic policy -- last-write-wins with the losing version saved as a conflict copy
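The change-log-plus-cursor model above can be shown with an in-memory sketch (a production log would be a durable store such as a Kafka topic or a database table; the class and field names here are illustrative). Each append gets the next sequence number, and a reconnecting device asks for everything after its last processed cursor. A last-write-wins conflict resolver that preserves the losing version is included.

```python
import itertools

class ChangeLog:
    """Append-only change log: every file operation is recorded with a
    monotonically increasing sequence number; devices pull deltas from
    their per-device cursor on reconnect."""

    def __init__(self):
        self._seq = itertools.count(1)
        self._events = []  # append-only list of (seq, event) pairs

    def append(self, event: dict) -> int:
        seq = next(self._seq)
        self._events.append((seq, event))
        return seq

    def changes_since(self, cursor: int):
        """Delta sync: all events this device has not yet processed."""
        return [(s, e) for s, e in self._events if s > cursor]

def resolve_conflict(local: dict, remote: dict):
    """Deterministic last-write-wins by modification time; the losing
    version is kept as a visible conflict copy rather than discarded."""
    winner, loser = ((local, remote) if local["mtime"] >= remote["mtime"]
                     else (remote, local))
    conflict_copy = {**loser, "name": loser["name"] + " (conflict copy)"}
    return winner, conflict_copy
```

In steady state the server pushes new sequence numbers over the device's WebSocket; the cursor-based pull is the fallback that makes reconnection after offline periods exactly the same code path.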
3. Metadata Consistency and File Operations
File operations like move, rename, and permission changes must be strongly consistent. Interviewers look for transactional guarantees in your metadata layer.
Hints to consider:
- Store directory trees, file versions, ACLs, and share links in a relational database (PostgreSQL) with ACID transactions
- Model the file system as a hierarchy with parent pointers and use transactions for moves/renames that update both source and destination
- Publish file-change events to Kafka after committing metadata changes (transactional outbox pattern) for downstream sync, indexing, and notification
- Version files with content hashes to enable deduplication and efficient diff detection
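The transactional outbox pattern above can be demonstrated with SQLite standing in for PostgreSQL (table names and the payload shape are illustrative). The key property: the metadata update and the outbox row commit in one transaction, so a separate relay process can later publish outbox rows to Kafka without ever emitting an event for a change that did not commit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     topic TEXT, payload TEXT);
""")

def move_file(conn, file_id: int, new_parent: int, new_name: str):
    """Move/rename and record the change event atomically.

    Both writes share one transaction: if either fails, neither the
    metadata change nor the event exists. A relay drains the outbox
    table and publishes each row to Kafka, then deletes it.
    """
    with conn:  # sqlite3 context manager = single transaction
        conn.execute(
            "UPDATE files SET parent_id = ?, name = ? WHERE id = ?",
            (new_parent, new_name, file_id))
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("file-changes", f'{{"op": "move", "id": {file_id}}}'))
```

Publishing to Kafka directly from the request handler, by contrast, risks either a committed change with no event (publish fails after commit) or an event for a rolled-back change (publish before commit), which is exactly what the outbox avoids.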
4. Content Delivery and Cost Optimization
Serving file downloads at global scale requires CDN-backed delivery and smart storage tiering. Interviewers want to see cost-conscious design.
Hints to consider:
- Serve downloads through CDN with signed URLs that expire, preventing unauthorized access
- Implement storage tiering: frequently accessed files on S3 Standard, older files on infrequent-access or Glacier-class storage
- Use content-addressable storage with deduplication at the chunk level to reduce storage costs across users
- Cache hot file metadata in Redis to reduce database load during burst access patterns
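Expiring signed URLs can be sketched with an HMAC over the path and expiry time, verified at the CDN edge. This is a generic illustration of the idea, not any particular CDN's signing scheme; the secret, query-parameter names, and 300-second default TTL are all illustrative.

```python
import hashlib
import hmac
import time

SECRET = b"cdn-signing-key"  # shared with the CDN edge (illustrative)

def sign_url(path: str, ttl: int = 300, now=None) -> str:
    """Return a download URL valid for `ttl` seconds; the signature
    covers both the path and the expiry, so neither can be altered."""
    expires = int(now if now is not None else time.time()) + ttl
    sig = hmac.new(SECRET, f"{path}:{expires}".encode(),
                   hashlib.sha256).hexdigest()
    return f"{path}?expires={expires}&sig={sig}"

def verify_url(path: str, expires: int, sig: str, now=None) -> bool:
    """Edge-side check: reject expired links and forged signatures."""
    if int(now if now is not None else time.time()) > expires:
        return False  # link has expired
    expected = hmac.new(SECRET, f"{path}:{expires}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)  # constant-time compare
```

Because verification needs only the shared secret, the edge can authorize downloads without calling back to the application tier, keeping app servers out of the download path just as pre-signed uploads keep them out of the upload path.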
Suggested Approach
Step 1: Clarify Requirements
Start by confirming scope and constraints. Ask about expected file sizes (typical and maximum), whether real-time collaboration on files (like Google Docs) is in scope, the number of devices per user, and offline duration expectations. Clarify sharing requirements: individual files only, or entire folder hierarchies? Determine if version history is required and for how long. Establish sync latency targets and consistency requirements for different operations.