For a full example answer with detailed architecture diagrams and deep dives, see our Design File System guide.
Design a file storage and synchronization system like Dropbox that allows users to upload, download, and sync files across multiple devices with real-time updates. Users drop a file in one place and it appears everywhere, even after going offline and reconnecting. The system must handle large files with resumable uploads, multi-device conflict resolution, and cost-efficient distribution through a CDN.
Interviewers ask this because it forces you to separate a control plane (metadata, authentication, sync) from a data plane (large file transfers), handle multi-device real-time updates, and design for large blobs, resumable transfers, conflict resolution, and efficient distribution. Strong answers demonstrate clear requirements, a scalable architecture, and practical trade-offs for reliability, consistency, and user experience.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Large files over unreliable networks need multipart uploads with checksums and the ability to resume from the last successful chunk. Interviewers want to see that you avoid routing file bytes through application servers.
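To make the resume behavior concrete, here is a minimal sketch, assuming an in-memory stand-in for the object store. `FakePartStore`, `upload`, and the tiny `PART_SIZE` are hypothetical names chosen for illustration; a real system would send each part to a pre-signed URL rather than to a Python object.

```python
import hashlib
from typing import Optional

PART_SIZE = 4  # tiny for illustration; real parts are typically megabytes


def split_parts(data: bytes, part_size: int = PART_SIZE) -> list:
    """Split a payload into fixed-size parts (the last one may be shorter)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


class FakePartStore:
    """Hypothetical stand-in for an object store's multipart endpoint."""

    def __init__(self) -> None:
        self.parts = {}  # part number -> bytes

    def put_part(self, number: int, body: bytes, sha256_hex: str) -> None:
        # Reject a part whose bytes do not match the checksum the client sent.
        if hashlib.sha256(body).hexdigest() != sha256_hex:
            raise ValueError(f"checksum mismatch on part {number}")
        self.parts[number] = body

    def uploaded_part_numbers(self) -> set:
        return set(self.parts)

    def complete(self, total_parts: int) -> bytes:
        # Only assemble the object once every part has arrived.
        if set(range(total_parts)) != set(self.parts):
            raise RuntimeError("cannot complete: parts missing")
        return b"".join(self.parts[n] for n in range(total_parts))


def upload(data: bytes, store: FakePartStore,
           fail_after: Optional[int] = None) -> int:
    """Upload only the parts the store lacks; return how many were sent.

    fail_after simulates a dropped connection after that many parts.
    """
    done = store.uploaded_part_numbers()
    sent = 0
    for number, body in enumerate(split_parts(data)):
        if number in done:
            continue  # resume: skip parts already acknowledged
        if fail_after is not None and sent >= fail_after:
            raise ConnectionError("simulated network drop")
        store.put_part(number, body, hashlib.sha256(body).hexdigest())
        sent += 1
    return sent
```

Calling `upload` again after a `ConnectionError` sends only the parts the store has not acknowledged, which is the property interviewers tend to look for.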
Multi-device sync requires a reliable mechanism for detecting changes, propagating them, and handling conflicts when the same file is edited on two devices while offline.
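One way to reason about conflict detection is a three-way comparison between the hash a device last synced (its base), its current local hash, and the server's current hash. The sketch below assumes that model; `SyncAction` and `decide` are illustrative names, not part of any real client.

```python
from enum import Enum


class SyncAction(Enum):
    NOOP = "noop"
    UPLOAD = "upload"
    DOWNLOAD = "download"
    CONFLICT = "conflict"


def decide(base_hash: str, local_hash: str, server_hash: str) -> SyncAction:
    """Pick a sync action from three content hashes.

    base_hash is the server state this device last saw for the file.
    """
    local_changed = local_hash != base_hash
    server_changed = server_hash != base_hash
    if local_changed and server_changed:
        # Both sides edited since the last sync. Identical results are
        # harmless; anything else needs a resolution policy.
        if local_hash == server_hash:
            return SyncAction.NOOP
        return SyncAction.CONFLICT
    if local_changed:
        return SyncAction.UPLOAD
    if server_changed:
        return SyncAction.DOWNLOAD
    return SyncAction.NOOP
```

What happens on `CONFLICT` is a product decision. One common policy is to keep both versions by saving the local edit as a separately named copy, so no user data is silently overwritten.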
A key architectural decision is separating the metadata path from the file transfer path. Interviewers look for understanding of why this matters and how to implement it.
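The split can be illustrated with a signed URL: the control plane signs a short-lived grant, and the storage tier verifies it without calling back to the API. The sketch below is a simplified HMAC scheme written for illustration, not the actual S3 signing algorithm; `SECRET`, `storage.example.com`, `sign_url`, and `verify_url` are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical key shared by the API and storage tier


def _signature(method: str, key: str, expires_at: int, secret: bytes) -> str:
    # The signature covers the method, object key, and expiry.
    message = f"{method}\n{key}\n{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(method: str, key: str, expires_at: int,
             secret: bytes = SECRET) -> str:
    """Control plane: grant one method on one object key until expires_at."""
    query = urlencode({"expires": expires_at,
                       "sig": _signature(method, key, expires_at, secret)})
    return f"https://storage.example.com/{key}?{query}"


def verify_url(method: str, url: str, now: int,
               secret: bytes = SECRET) -> bool:
    """Data plane: check the grant locally, with no call to the API tier."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires_at = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False  # short TTL: the grant has expired
    key = parsed.path.lstrip("/")
    expected = _signature(method, key, expires_at, secret)
    return hmac.compare_digest(sig, expected)
```

Because the signature covers the method, key, and expiry, a URL issued for uploading one chunk cannot be reused to read or overwrite a different object, and the application servers never touch the file bytes.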
Transferring entire files on every edit wastes bandwidth. Interviewers appreciate designs that minimize data transfer through content-aware techniques.
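Content-defined chunking is one such technique: chunk boundaries are chosen from the bytes themselves, so an insertion near the start of a file does not shift every later boundary. The sketch below uses a Gear-style rolling hash as an illustrative choice. The table seed, the 6-bit mask, and the function names are assumptions, and production chunkers also enforce minimum and maximum chunk sizes, which are omitted here.

```python
import hashlib
import random

_rng = random.Random(42)  # fixed seed so every client derives the same table
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK = (1 << 6) - 1   # cut when the low 6 bits are zero: ~64-byte chunks
U64 = (1 << 64) - 1


def chunk(data: bytes) -> list:
    """Split data at content-defined boundaries.

    The low 6 bits of the hash depend only on the last 6 bytes, so a
    boundary is a property of nearby content, not of the byte offset.
    """
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & U64
        if h & MASK == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last cut
    return chunks


def chunk_hashes(data: bytes) -> list:
    """Hash each chunk; the hashes act as dedup keys on the server."""
    return [hashlib.sha256(c).hexdigest() for c in chunk(data)]
```

Comparing `chunk_hashes` for a file before and after a small edit shows that only the chunks around the edit get new hashes, so only those chunks need to be uploaded.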
Confirm the scope with your interviewer. Key questions include: What is the maximum file size? Is real-time collaboration (like Google Docs) in scope, or just sync? How many devices per user? Is version history required, and for how long? What are the expected read-to-write ratios? Establish whether offline editing and conflict resolution are priorities.
Sketch the major components: a client application that monitors the local file system and maintains a sync database; an API gateway for authentication, metadata operations, and sync coordination; a metadata service backed by PostgreSQL for file trees, sharing, and version history; an object storage layer (Amazon S3) for file chunks; a CDN (CloudFront) for fast downloads; a notification service using WebSockets for push-based sync triggers; and a message queue (Kafka) for propagating change events to indexing, thumbnail generation, and antivirus scanning.
Walk through the upload flow end to end. The client detects a new or modified file, splits it into chunks using content-defined chunking, computes content hashes, and queries the server for which chunks already exist. For missing chunks, the client requests pre-signed S3 URLs from the API, uploads chunks directly, and notifies the server on completion. The server atomically updates the file metadata and appends an entry to the change log. Connected devices receive a push notification, query the change log for new entries, download only the chunks they lack, and reassemble the file locally. Discuss how offline devices replay the change log on reconnection and how conflicts are detected using content hashes compared against the last-known server state.
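The flow above can be sketched with an in-memory server. This is a toy model under stated assumptions: `SyncServer`, `upload_file`, and `pull` are hypothetical names, `put_chunk` stands in for a direct upload to object storage, and reading `server.chunks` stands in for a CDN download.

```python
import hashlib
from dataclasses import dataclass, field


def sha(body: bytes) -> str:
    return hashlib.sha256(body).hexdigest()


@dataclass
class SyncServer:
    chunks: dict = field(default_factory=dict)      # hash -> bytes
    manifests: dict = field(default_factory=dict)   # path -> [hash, ...]
    change_log: list = field(default_factory=list)  # (seq, path, hashes)

    def missing(self, hashes: list) -> list:
        """Tell the client which chunks it still needs to upload."""
        return [h for h in hashes if h not in self.chunks]

    def put_chunk(self, body: bytes) -> None:
        self.chunks[sha(body)] = body

    def commit(self, path: str, hashes: list) -> int:
        """Update the manifest and append a change-log entry together."""
        if self.missing(hashes):
            raise RuntimeError("commit refused: chunks missing")
        self.manifests[path] = list(hashes)
        seq = len(self.change_log) + 1
        self.change_log.append((seq, path, list(hashes)))
        return seq

    def changes_since(self, cursor: int) -> list:
        return [entry for entry in self.change_log if entry[0] > cursor]


def upload_file(server: SyncServer, path: str, file_chunks: list) -> int:
    """Upload only the chunks the server lacks; return how many were sent."""
    by_hash = {sha(c): c for c in file_chunks}
    needed = server.missing(list(by_hash))
    for h in needed:
        server.put_chunk(by_hash[h])
    server.commit(path, [sha(c) for c in file_chunks])
    return len(needed)


def pull(server: SyncServer, cursor: int, cache: dict):
    """Replay the change log from cursor, fetching only unknown chunks."""
    files = {}
    for seq, path, hashes in server.changes_since(cursor):
        for h in hashes:
            if h not in cache:
                cache[h] = server.chunks[h]
        files[path] = b"".join(cache[h] for h in hashes)
        cursor = seq
    return cursor, files
```

A device that was offline calls `pull` with its stored cursor on reconnection and receives exactly the entries it missed, which is why the change log needs a monotonically increasing sequence number.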
Cover sharing by storing ACLs in PostgreSQL with permission inheritance from folders to files, enforced at every API call. Discuss search using a full-text index over file names and metadata, updated asynchronously via Kafka. Address versioning by retaining previous chunk manifests for a configurable retention period, enabling point-in-time restore. Mention security: encryption at rest in S3, encryption in transit with TLS, and scoped pre-signed URLs with short TTLs.
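Permission inheritance can be sketched as a walk from the file up through its parent folders. The version below assumes the nearest explicit entry wins, which is one possible policy rather than the only one; the `acls` dictionary stands in for rows in the metadata database, and the permission names are illustrative.

```python
from pathlib import PurePosixPath

LEVELS = {"none": 0, "read": 1, "write": 2}


def effective_permission(acls: dict, user: str, path: str) -> str:
    """Return the user's permission on path, inheriting from parent folders.

    acls maps a folder or file path to {user: permission}. The nearest
    explicit entry on the way up to the root wins (an assumed policy).
    """
    node = PurePosixPath(path)
    while True:
        entry = acls.get(str(node), {})
        if user in entry:
            return entry[user]
        if node == node.parent:  # reached the root without a match
            return "none"
        node = node.parent


def can(acls: dict, user: str, path: str, needed: str) -> bool:
    """Check run on every API call before touching metadata or chunks."""
    return LEVELS[effective_permission(acls, user, path)] >= LEVELS[needed]
```

With `acls = {"/team": {"ana": "write", "bo": "read"}, "/team/private": {"bo": "none"}}`, the user `bo` can read `/team/docs/a.txt` through inheritance but is blocked under `/team/private` by the closer entry.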
"OneDrive Application where we can upload, download and sync when online and how it behaves when offline. There was a lot of focus on designing the client side."
"The focus was on the synchronization across multiple devices. The interviewer pushed hard for versioning solution."
"Dropbox-like SAS service, without sync, with subscription and remove user files who haven't paid, despite multiple notifications."