For a walkthrough of designing a distributed file system with sync semantics, see our Design File System guide. It covers metadata management, chunked storage, and multi-device synchronization patterns that form the foundation for this problem.
Also review the Blob Storage, Databases, Message Queues, and CDN building blocks.
Design a file storage and synchronization service like Dropbox that allows users to upload, download, and sync files across multiple devices with real-time updates. The core experience is seamless: drop a file into a folder on one device and it appears on every other device within seconds, even if some devices were offline when the change happened.
This problem forces you to separate a control plane (metadata, authentication, sync coordination) from a data plane (large file transfers). You need to handle multi-device real-time updates, large blob uploads with resumability, conflict resolution when two devices edit the same file offline, and cost-efficient distribution through CDN and object storage. Interviewers use this question to evaluate your ability to decompose a seemingly simple product into well-bounded services with clear consistency and durability guarantees.
Based on real interview experiences, these are the areas interviewers probe most deeply:
The most common mistake is routing file data through application servers. Interviewers immediately probe whether you understand that metadata operations and bulk data transfers must travel separate paths.
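One way to make the control-plane/data-plane split concrete is a metadata endpoint that returns a pre-signed upload URL, so the client PUTs bytes straight to object storage. This is a minimal sketch with a hypothetical HMAC scheme standing in for a real signer such as S3's (the key, hostname, and query format here are illustrative, not any provider's API):

```python
import hashlib
import hmac
import time

SIGNING_KEY = b"demo-secret"  # stand-in for real storage credentials

def presign_upload_url(bucket: str, key: str, ttl_seconds: int = 900) -> str:
    """Control-plane call: authorize an upload without proxying the bytes.

    The client PUTs the chunk directly to object storage using this URL;
    the application server signs the request but never touches file data.
    """
    expires = int(time.time()) + ttl_seconds
    payload = f"PUT\n{bucket}\n{key}\n{expires}".encode()
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return f"https://{bucket}.storage.example.com/{key}?expires={expires}&sig={signature}"

url = presign_upload_url("user-chunks", "chunk/ab12cd")
```

The application servers stay small and stateless because they only mint short-lived URLs; bandwidth-heavy traffic lands on the storage tier, which is built for it.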
Users upload multi-gigabyte files over unreliable networks. Without chunking and resume capability, a dropped connection near the end of a large upload forces the user to start over from byte zero.
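Chunked, resumable upload can be sketched as follows — split the file into fixed-size pieces, hash each one, and after a failure re-send only the chunks the server has not acknowledged. The function names and 4 MiB chunk size are illustrative assumptions, not a specific product's protocol:

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB; an assumed, commonly cited chunk size

def chunk_manifest(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[dict]:
    """Split a file into fixed-size chunks and hash each one.

    The per-chunk hashes serve double duty: they identify which chunks
    to resume after a failure, and they act as deduplication keys.
    """
    manifest = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        manifest.append({
            "index": offset // chunk_size,
            "offset": offset,
            "size": len(chunk),
            "sha256": hashlib.sha256(chunk).hexdigest(),
        })
    return manifest

def chunks_to_resend(manifest: list[dict], acked_indexes: set) -> list[dict]:
    """After a dropped connection, upload only the unacknowledged chunks."""
    return [c for c in manifest if c["index"] not in acked_indexes]
```

If a 5 GB upload dies at 99%, the client replays only the final unacknowledged chunks instead of all 5 GB.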
When a file changes on one device, all other linked devices must learn about it quickly and pull the updated content. Polling is wasteful and slow; pure push is fragile at scale.
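A common middle ground is a per-user change log with cursor-based catch-up: connected devices get a tiny push ("you are behind"), then pull entries after their last-seen cursor, so offline devices catch up through the exact same path. A minimal in-memory sketch (class and method names are assumptions for illustration; `push_ping` stands in for a real WebSocket send):

```python
from collections import defaultdict

class SyncService:
    """Per-user change log with cursor-based catch-up."""

    def __init__(self):
        self.change_log = defaultdict(list)   # user_id -> ordered entries
        self.connected = defaultdict(set)     # user_id -> live device ids

    def record_change(self, user_id: str, entry: dict) -> int:
        log = self.change_log[user_id]
        log.append(entry)
        cursor = len(log)  # monotonically increasing position
        for device in self.connected[user_id]:
            self.push_ping(device, cursor)  # tiny payload, never file data
        return cursor

    def push_ping(self, device_id, cursor):
        pass  # placeholder for a WebSocket send in a real system

    def changes_since(self, user_id: str, cursor: int) -> list[dict]:
        """A device resumes from its last cursor, online or after being offline."""
        return self.change_log[user_id][cursor:]

svc = SyncService()
svc.record_change("alice", {"file": "/notes.txt", "op": "update"})
svc.record_change("alice", {"file": "/photo.jpg", "op": "create"})
```

Because the push is only a hint and the log is the source of truth, a lost WebSocket message costs latency, not correctness.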
Two devices may edit the same file while offline. When both reconnect, the system must decide how to handle the conflict without losing either version.
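One widely used resolution strategy is a compare-and-set on the file's version: each device remembers the version it last synced, and if the server has moved past that base, the later writer's file is preserved as a conflict copy instead of silently overwriting. A sketch under assumed names (the dict-backed store and "conflicted copy" naming are illustrative):

```python
def apply_edit(server_versions: dict, path: str, base_version: int,
               new_content_hash: str, device: str) -> str:
    """Detect offline-edit conflicts with a compare-and-set on version.

    If the device's base_version still matches the server, the edit
    fast-forwards. Otherwise both edits started from the same base,
    so we keep both versions rather than losing either one.
    """
    current = server_versions.get(path, {"version": 0, "hash": None})
    if base_version == current["version"]:
        server_versions[path] = {"version": current["version"] + 1,
                                 "hash": new_content_hash}
        return path  # clean fast-forward
    conflict_path = f"{path} (conflicted copy from {device})"
    server_versions[conflict_path] = {"version": 1, "hash": new_content_hash}
    return conflict_path

versions = {}
apply_edit(versions, "/doc.txt", 0, "hash-a", "laptop")   # wins the race
result = apply_edit(versions, "/doc.txt", 0, "hash-b", "phone")  # same base
```

The user sees both files side by side and merges manually, which is the honest answer for opaque binary content; automatic merging only makes sense for formats the server understands.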
At petabyte scale, storage cost dominates. Interviewers expect you to think about deduplication, tiered storage, and lifecycle policies.
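Deduplication usually means a content-addressable chunk store: a chunk's SHA-256 is its identity, identical chunks are stored once, and reference counts decide when a chunk can be deleted or demoted to a cold tier. A minimal in-memory sketch (a dict stands in for object storage; the class name is an assumption):

```python
import hashlib

class ChunkStore:
    """Content-addressable chunk store with reference counting."""

    def __init__(self):
        self.blobs = {}      # hash -> bytes (object storage in reality)
        self.refcount = {}   # hash -> number of referencing file versions

    def put(self, chunk: bytes) -> str:
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in self.blobs:      # dedup: identical chunk, no re-store
            self.blobs[digest] = chunk
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest

    def release(self, digest: str):
        """Called when a file version is deleted or expires from history."""
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.blobs[digest]        # or demote to a cheaper storage tier
            del self.refcount[digest]
```

The same mechanism makes version history cheap: a new version of a large file references mostly unchanged chunks and pays only for the ones that differ.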
Confirm the expected file size distribution (documents versus media versus archives), whether real-time collaboration on file contents is needed or just file-level sync, how many devices a typical user links, and whether the system is consumer-facing or enterprise. Ask about retention policies, version history depth, and sharing model (internal teams versus public links). Establish offline requirements: do users need to pre-select files for offline access, or should the system sync everything automatically?
Sketch the core components: a metadata service backed by a relational database (Postgres) for folder trees, permissions, and file version records; an object storage layer (S3) for file chunks; a sync service that maintains a per-user change log and pushes notifications to connected devices; an upload service that generates pre-signed URLs and tracks chunk progress; a notification service that alerts devices of new changes via WebSocket; and a background processing pipeline for thumbnails, virus scanning, and search indexing. Show the clear separation between metadata API calls and direct-to-S3 data transfers.
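The two records at the heart of the metadata service can be sketched as plain data types — a file version pointing at its ordered chunks, and a change-log entry carrying the per-user cursor. Field names here are illustrative assumptions, not a fixed schema:

```python
from dataclasses import dataclass

@dataclass
class FileVersion:
    file_id: str
    version: int
    chunk_hashes: list[str]  # ordered chunk hashes that reassemble the file
    size: int

@dataclass
class ChangeLogEntry:
    seq: int        # per-user monotonic position; devices sync from a cursor
    file_id: str
    version: int
    op: str         # "create" | "update" | "delete" | "move"
```

Keeping file versions immutable and append-only makes the change log trivially replayable and makes version history a pure metadata feature.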
Walk through the end-to-end upload flow. The client splits the file into chunks, computes a hash per chunk, and asks the metadata service which chunks already exist (deduplication check). For new chunks, the client receives pre-signed upload URLs and uploads directly to S3. Once all chunks land, the client notifies the metadata service, which atomically creates a new file version record and appends an entry to the user's change log. The sync service pushes a notification to all other connected devices. Each device fetches the change log from its last cursor position, discovers the new file version, downloads the new or changed chunks via pre-signed download URLs, and reconstructs the file locally.
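The commit step at the end of this flow deserves emphasis: the new version record and the change-log entry must land in one transaction, so no device ever observes one without the other. A sketch using SQLite as a stand-in for the metadata database (table and column names are assumptions):

```python
import sqlite3

def commit_version(db, user_id: str, file_id: str, chunk_hashes: list) -> int:
    """Once all chunks are in object storage, atomically create the new
    file version record and append to the user's change log."""
    with db:  # sqlite3 context manager wraps the body in one transaction
        cur = db.execute(
            "SELECT COALESCE(MAX(version), 0) FROM file_versions WHERE file_id = ?",
            (file_id,))
        version = cur.fetchone()[0] + 1
        db.execute(
            "INSERT INTO file_versions (file_id, version, chunks) VALUES (?, ?, ?)",
            (file_id, version, ",".join(chunk_hashes)))
        db.execute(
            "INSERT INTO change_log (user_id, file_id, version) VALUES (?, ?, ?)",
            (user_id, file_id, version))
    return version

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE file_versions (file_id TEXT, version INTEGER, chunks TEXT)")
db.execute("CREATE TABLE change_log (seq INTEGER PRIMARY KEY AUTOINCREMENT,"
           " user_id TEXT, file_id TEXT, version INTEGER)")
```

This is exactly where the relational choice for metadata pays off: if either insert fails, the whole commit rolls back and other devices simply never learn about the half-finished upload.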
Cover reliability by replicating metadata across database replicas with automatic failover, and by relying on S3's built-in durability for file data. Discuss search indexing by extracting text content from uploaded files and feeding it to an Elasticsearch cluster via a Kafka-powered pipeline. Address sharing by storing ACLs in the metadata database and checking permissions on every API call and pre-signed URL generation. Touch on monitoring: track sync lag per device, upload success rates, deduplication ratios, and storage growth. If time permits, discuss multi-region deployment where metadata is replicated globally and file chunks are cached at edge locations via CDN for faster downloads.
Candidates at Miro report that the interviewer focused heavily on the synchronization mechanism across multiple devices, asking detailed questions about how versioning works and how conflicts are detected and resolved. Another common emphasis is on strong consistency for metadata operations -- interviewers want to see that you understand why ACID properties matter for operations like rename, move, and permission changes even when file sync itself can be eventually consistent.