Design a file system for write-once media
System Design · Optional
Problem Statement
You are asked to design a file system that targets write-once media such as WORM (Write Once Read Many) drives, optical discs, or tape archives. Unlike conventional file systems that allow arbitrary in-place updates, every byte written to the underlying medium is permanent. This constraint fundamentally changes how you handle metadata updates, directory structures, and space reclamation.
The system must support creating, reading, and logically deleting files while maintaining crash-safe guarantees. Clients expect atomic publish semantics so that a partially written file never becomes visible to readers. The design should also accommodate versioning, since physical overwrites are impossible and logical deletes must coexist with immutable data already on disk.
Consider how the system organizes data into chunks, tracks them through manifests, and recovers gracefully after unexpected failures. Think about the tradeoffs between sequential write throughput, metadata lookup latency, and the overhead of maintaining an append-only index.
Key Requirements
Functional
- Atomic file publish -- A file becomes visible to readers only after all chunks and its manifest have been fully committed to the medium.
- Logical delete with tombstones -- Since bytes cannot be overwritten, deletes are recorded as tombstone entries that hide the file from directory listings while preserving the underlying data.
- File versioning -- Support multiple immutable versions of the same logical path, enabling clients to retrieve historical snapshots of a file.
- Chunk-based data layout -- Split files into fixed-size chunks with a manifest that maps logical offsets to physical chunk locations on the medium.
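To make the chunk-and-manifest idea concrete, here is a minimal sketch in Python. The names (`ChunkRef`, `Manifest`, `publish`, `read`) and the tiny 4-byte chunk size are illustrative assumptions, not a prescribed API; a real system would use multi-megabyte chunks and write the manifest to the medium as the final, visibility-granting record.

```python
import hashlib
from dataclasses import dataclass

CHUNK_SIZE = 4  # illustrative; real systems use e.g. 4-64 MiB chunks


@dataclass(frozen=True)
class ChunkRef:
    logical_offset: int   # byte offset within the file
    physical_offset: int  # append position on the WORM volume
    length: int
    checksum: str         # validates chunk integrity during recovery


@dataclass
class Manifest:
    path: str
    version: int
    chunks: list


def publish(volume: bytearray, path: str, version: int, data: bytes) -> Manifest:
    """Append data as fixed-size chunks and return a manifest mapping
    logical offsets to physical locations. Appending the manifest itself
    would be the last step that makes the file visible."""
    refs = []
    for off in range(0, len(data), CHUNK_SIZE):
        piece = data[off:off + CHUNK_SIZE]
        refs.append(ChunkRef(off, len(volume), len(piece),
                             hashlib.sha256(piece).hexdigest()))
        volume.extend(piece)  # append-only: earlier bytes are never touched
    return Manifest(path, version, refs)


def read(volume: bytes, manifest: Manifest) -> bytes:
    """Reassemble a file by following the manifest's chunk map."""
    return b"".join(volume[r.physical_offset:r.physical_offset + r.length]
                    for r in manifest.chunks)
```

Because chunks carry both logical and physical offsets, random reads only need the manifest, and versioning falls out naturally: each version is simply a new manifest pointing at its own (or shared) chunks.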
Non-Functional
- Scalability -- Handle petabytes of archived data across thousands of WORM volumes without degrading metadata lookup performance.
- Latency -- Metadata reads (directory listing, file lookup) should complete in single-digit milliseconds even when the underlying medium has high seek times.
- Durability -- Zero data loss; every committed write must survive power failures and medium errors through redundant metadata and checksums.
- Crash recovery -- The system must resume from any interruption point without leaving orphaned chunks or phantom file entries.
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Append-Only Data Structures
The immutability constraint means you cannot update existing blocks. Interviewers want to see how you design log-structured or append-only indexes that keep metadata current without modifying previously written sectors.
Hints to consider:
- Think about how log-structured merge trees handle compaction when compaction itself cannot reclaim physical space
- Consider maintaining an in-memory index that is periodically checkpointed as a new append-only snapshot
- Explore how the write-ahead log concept adapts when the entire medium is itself append-only
- Look at how Kafka topics use immutable segment files with offset-based indexing as a mental model
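The checkpointed in-memory index from the hints above can be sketched as follows. This is an assumed design, not a reference implementation: a `CHECKPOINT` record periodically embeds the full index into the append-only log, so recovery scans back to the latest checkpoint and replays only the tail after it.

```python
import json


class AppendOnlyIndex:
    """In-memory path -> manifest-location index persisted as an
    append-only log. A checkpoint record embeds a full snapshot of the
    index so recovery never replays the entire log from the beginning."""

    def __init__(self):
        self.log = []    # stand-in for records appended to the volume
        self.index = {}  # current in-memory view

    def put(self, path, manifest_loc):
        self.log.append(json.dumps({"op": "put", "path": path,
                                    "loc": manifest_loc}))
        self.index[path] = manifest_loc

    def checkpoint(self):
        # Snapshot the whole index as one new append-only record.
        self.log.append(json.dumps({"op": "ckpt", "index": self.index}))

    @classmethod
    def recover(cls, log):
        """Scan backwards for the latest checkpoint, then replay the tail."""
        idx = cls()
        idx.log = list(log)
        start = 0
        for i in range(len(log) - 1, -1, -1):
            rec = json.loads(log[i])
            if rec["op"] == "ckpt":
                idx.index = dict(rec["index"])
                start = i + 1
                break
        for line in log[start:]:
            rec = json.loads(line)
            if rec["op"] == "put":
                idx.index[rec["path"]] = rec["loc"]
        return idx
```

The tradeoff to discuss: frequent checkpoints waste write-once capacity (each snapshot is permanent), while infrequent ones lengthen the recovery replay window.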
2. Atomic Publish and Crash Recovery
Interviewers probe whether partially written files can leak to readers and how you recover from mid-write crashes. The key challenge is making multi-chunk writes appear atomic on a medium that does not support transactions.
Hints to consider:
- Consider a two-phase approach: write all chunks first, then write a single manifest record that atomically makes the file visible
- Think about using a separate metadata store (like PostgreSQL or ZooKeeper) to track commit status
- Explore how you detect and clean up orphaned chunks after a crash
- Consider using checksums in the manifest to validate chunk integrity during recovery scans
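The two-phase publish and the post-crash orphan scan can be sketched together. The record shapes and function names here are assumptions for illustration: chunks are appended first, a single manifest record commits the file, and a recovery scan treats checksum-invalid or unreferenced chunks as dead space (on WORM media they cannot be reclaimed, only excluded from the index).

```python
import hashlib


def append_chunk(log, chunk_id, data):
    # Phase 1: chunks are written first and remain invisible until a
    # manifest referencing them is committed.
    log.append(("chunk", chunk_id, data, hashlib.sha256(data).hexdigest()))


def commit_manifest(log, path, chunk_ids):
    # Phase 2: one manifest record atomically makes the file visible.
    log.append(("manifest", path, tuple(chunk_ids)))


def recovery_scan(log):
    """Return (visible_files, orphaned_chunk_ids) after a crash.
    A manifest only counts if every chunk it references was written
    intact; chunks no manifest references are orphans."""
    chunks, referenced, files = {}, set(), {}
    for rec in log:
        if rec[0] == "chunk":
            _, cid, data, digest = rec
            if hashlib.sha256(data).hexdigest() == digest:
                chunks[cid] = data  # corrupt chunks are simply skipped
        else:
            _, path, cids = rec
            if all(c in chunks for c in cids):  # reject torn manifests
                files[path] = cids
                referenced.update(cids)
    return files, set(chunks) - referenced
```

A crash between the two phases leaves orphaned chunks but never a half-visible file, which is exactly the atomic-publish guarantee interviewers probe for.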
3. Space Management and Logical Deletes
Since physical reclamation is impossible on write-once media, interviewers want to understand your strategy for managing ever-growing storage and how tombstones interact with versioning.
Hints to consider:
- Think about how tombstone compaction works when you can only append new tombstone records
- Consider tiered storage where cold volumes with high tombstone ratios trigger data migration to new volumes
- Explore how garbage collection metadata is itself stored on write-once media
- Discuss the tradeoff between keeping a full version history and the cost of scanning through tombstones during reads
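One way to ground the tombstone discussion is a sketch of how reads resolve an append-only record stream, and how a volume's "dead ratio" could drive a migration policy. The record format and the 0.5 threshold mentioned in the comment are assumptions, not part of the problem statement.

```python
def list_live(records):
    """Resolve a directory listing from an append-only record stream.
    Records are ("put", path, version) or ("tomb", path); the latest
    record per path wins, so a tombstone hides all earlier versions."""
    latest = {}
    for rec in records:
        latest[rec[1]] = rec  # later appends shadow earlier ones
    return sorted(path for path, rec in latest.items() if rec[0] == "put")


def dead_ratio(records):
    """Fraction of records that no longer contribute to the live view.
    A migration policy might copy the survivors to a fresh volume once
    this crosses a threshold (say 0.5), retiring the old volume."""
    if not records:
        return 0.0
    latest = {}
    for rec in records:
        latest[rec[1]] = rec
    live = sum(1 for rec in latest.values() if rec[0] == "put")
    return 1 - live / len(records)
```

Note that migration only hides the cost: the old volume still physically holds every byte, so the real tradeoff is dollars of write-once media versus scan time through superseded records.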