For a full example answer with detailed architecture diagrams and deep dives, see our Design a Distributed File System guide. The file system guide covers chunked storage, metadata management, and synchronization patterns that form the foundation of a Dropbox-like service.
Also review the Blob Storage, Message Queues, and Databases building blocks for background on large object storage, change event propagation, and metadata consistency.
Design a file storage and synchronization system like Dropbox that allows users to upload, download, and sync files across multiple devices with real-time updates. The core experience is effortless: drop a file in one place and it appears everywhere, even if you go offline and come back later.
The key architectural challenge is separating the control plane (metadata, authentication, sync coordination) from the data plane (large file transfers). You must handle multi-device real-time synchronization, large binary uploads with resume capability, conflict resolution when the same file is edited on two devices while offline, and cost-efficient content distribution. Interviewers want to see clear requirements, a scalable architecture with distinct metadata and blob paths, and practical tradeoffs for reliability, consistency, and user experience.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Files are frequently multi-gigabyte and uploaded over unreliable networks. Interviewers want to see how you avoid turning application servers into a bottleneck and ensure uploads survive interruptions.
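One way to keep application servers out of the data path and make uploads survive interruptions is client-side chunking with per-chunk hashes. Below is a minimal sketch; the `have_chunk` and `put_chunk` callbacks are hypothetical stand-ins for a dedup-check API and a pre-signed-URL PUT, and the 4 MiB chunk size is just an illustrative choice:

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB: an illustrative fixed chunk size

def chunk_file(path):
    """Split a file into fixed-size chunks and hash each one.

    Returns a manifest: an ordered list of (index, sha256_hex, length).
    The hashes let the server dedupe chunks and let the client resume,
    since only chunks the server is missing need to be re-sent.
    """
    manifest = []
    with open(path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            manifest.append((index, hashlib.sha256(chunk).hexdigest(), len(chunk)))
            index += 1
    return manifest

def upload_with_resume(path, manifest, have_chunk, put_chunk, max_retries=3):
    """Upload only the chunks the server does not already have,
    retrying each chunk independently so a dropped connection
    costs at most one chunk of progress."""
    with open(path, "rb") as f:
        for index, digest, length in manifest:
            if have_chunk(digest):
                continue  # dedup hit or already uploaded: skip
            f.seek(index * CHUNK_SIZE)
            data = f.read(length)
            for attempt in range(max_retries):
                try:
                    put_chunk(digest, data)
                    break
                except IOError:
                    if attempt == max_retries - 1:
                        raise
```

Because each chunk is addressed by its hash, a resumed upload is just a re-run: already-accepted chunks are skipped and only the remainder is transferred.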
Keeping files in sync across multiple devices is the defining feature. Interviewers expect a concrete sync mechanism with cursors, change logs, and conflict handling.
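The cursor-and-change-log mechanism can be sketched as an append-only log with monotonically increasing sequence numbers; each device remembers the last sequence number it applied and asks for everything after it. This is a simplified in-memory stand-in for a database-backed log, with assumed entry and API shapes:

```python
class ChangeLog:
    """Append-only per-user change log for cursor-based sync.

    A device's cursor is simply the highest sequence number it has
    applied; fetching with that cursor returns exactly the changes
    it has not yet seen, in order.
    """
    def __init__(self):
        self._entries = []  # list of (seq, file_id, op)
        self._seq = 0

    def append(self, file_id, op):
        """Record a change and return its sequence number."""
        self._seq += 1
        self._entries.append((self._seq, file_id, op))
        return self._seq

    def changes_since(self, cursor, limit=100):
        """Return up to `limit` entries after `cursor`, plus the new
        cursor the device should persist before its next fetch."""
        batch = [e for e in self._entries if e[0] > cursor][:limit]
        new_cursor = batch[-1][0] if batch else cursor
        return batch, new_cursor
```

The same call serves both a reconnecting device (cursor far behind) and a connected one reacting to a push notification (cursor one behind), which keeps the sync protocol uniform.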
Interviewers look for a clean separation between the control plane handling file metadata and the data plane handling actual file bytes. Mixing them creates scaling bottlenecks.
When two users edit the same shared file, or a single user edits on two offline devices, the system must handle conflicts gracefully. Interviewers probe whether you have a concrete strategy.
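A concrete strategy many candidates reach for is optimistic concurrency with conflict copies: each commit carries the version the client last saw, and a mismatch means another device won the race, so the system keeps both versions rather than silently overwriting one. A minimal sketch, using a plain dict as a stand-in for the metadata store and a hypothetical conflict-copy naming scheme:

```python
def commit_file(metadata, file_id, base_version, new_manifest, device):
    """Optimistic-concurrency commit.

    `base_version` is the version the committing device last synced.
    If the stored version has moved on, another device committed
    first; instead of overwriting, create a conflict copy so no
    edits are lost and the user can merge manually.
    """
    record = metadata[file_id]
    if record["version"] != base_version:
        # Lost the race: materialize the losing edit as a sibling file.
        conflict_id = f"{file_id} ({device}'s conflicted copy)"
        metadata[conflict_id] = {"version": 1, "manifest": new_manifest}
        return conflict_id
    record["version"] += 1
    record["manifest"] = new_manifest
    return file_id
```

The key property to call out: the check-and-increment must be atomic in the metadata database (e.g. a conditional UPDATE), otherwise two racing commits could both pass the version check.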
Start by confirming scope and constraints. Ask about the expected file size distribution (many small files or some very large ones), the number of devices per user, and whether real-time collaboration on the same file is required or just sync-on-save. Clarify whether sharing is public links only or includes fine-grained ACLs. Verify the acceptable sync delay and whether offline editing with conflict resolution is a hard requirement. Establish durability and availability SLAs.
Sketch the major components: a Metadata Service (PostgreSQL) that manages the file tree, permissions, versions, and chunk manifests; an Upload Service that generates pre-signed URLs for direct-to-S3 chunk uploads and orchestrates multi-step upload workflows; a Sync Service that maintains per-device cursors and pushes change notifications via WebSocket or long-polling; an Event Bus (Kafka) that emits file-change events for background processors; a Search Service for file name indexing; and a CDN for fast downloads. Show two distinct data flows: the upload path (client to S3 via pre-signed URL, then metadata commit) and the sync path (change event to Kafka, fan-out to connected devices).
Walk through the upload flow for a large file. The client splits the file into chunks, computes a SHA-256 hash per chunk, and requests upload URLs from the Upload Service. For each chunk, the client uploads directly to S3 using the pre-signed URL. Once all chunks are uploaded, the client sends a commit request to the Metadata Service, which atomically creates the file record with the chunk manifest and increments the user's change log sequence number. The Sync Service detects the new change log entry and pushes a notification to the user's other connected devices, which fetch the updated metadata and download chunks they do not already have locally. Discuss how deduplication works: if a chunk hash already exists in storage, skip the upload and just reference the existing chunk.
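The deduplication step above is easiest to explain as a content-addressed chunk store: chunks are keyed by their SHA-256, so an identical chunk from any file or user is stored once and reference-counted for garbage collection. This in-memory sketch stands in for the real blob store plus a reference-count table:

```python
import hashlib

class ChunkStore:
    """Content-addressed chunk store illustrating deduplication.

    `put` returns the chunk's hash; if the blob already exists the
    write is skipped and only the reference count grows. `release`
    drops a reference and deletes the blob once no file manifest
    points at it.
    """
    def __init__(self):
        self._blobs = {}  # sha256 hex -> chunk bytes
        self._refs = {}   # sha256 hex -> reference count

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:  # dedup hit: skip the write
            self._blobs[digest] = data
        self._refs[digest] = self._refs.get(digest, 0) + 1
        return digest

    def release(self, digest):
        self._refs[digest] -= 1
        if self._refs[digest] == 0:  # no manifest references it anymore
            del self._refs[digest]
            del self._blobs[digest]
```

In a real system the hash check happens before the client uploads (that is the "skip the upload" step), and reference counting or mark-and-sweep GC prevents a shared chunk from being deleted while any file version still references it.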
Cover conflict resolution: when two devices upload different versions of the same file, the second upload detects a version mismatch and creates a conflict copy. Discuss sharing: store ACLs in the metadata database and check permissions on every API call; for shared folders, propagate changes to all participants' change logs. Address search: index file names and paths in Elasticsearch with user permission filtering. Mention background processing: use Kafka consumers for thumbnail generation, antivirus scanning, and storage quota enforcement. Cover monitoring: track upload success rates, sync latency, chunk deduplication ratio, and storage growth. Discuss disaster recovery: replicate metadata across availability zones and rely on S3's built-in durability for file content.
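The per-call ACL check for shared folders is typically hierarchical: a grant on a folder applies to everything beneath it, so the check walks from the requested path up toward the root. A small sketch, assuming an in-memory `acls` mapping of folder path to per-user action sets in place of the metadata database:

```python
def can_access(acls, user, path, action):
    """Hierarchical ACL check.

    Walks from `path` up through its ancestor folders looking for a
    grant of `action` to `user`. `acls` maps a folder path to
    {user: set_of_actions}, e.g. {"team/docs": {"alice": {"read"}}}.
    """
    parts = path.split("/")
    for i in range(len(parts), 0, -1):
        prefix = "/".join(parts[:i])
        grant = acls.get(prefix, {}).get(user, set())
        if action in grant:
            return True
    return False
```

The same walk explains why shared-folder changes must fan out to every participant's change log: each participant's devices sync from their own log, but authorization is resolved against the shared folder's ACL at read time.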
"Design a file sharing system. Interviewer indexed on strong consistency and ACID properties."
"OneDrive application where we can upload, download and sync when online and how it behaves when offline. There was a lot of focus on designing the client side."