Design a Database like Bitcask
System Design (Optional)
Problem Statement
Design a log-structured key-value storage engine similar to Bitcask that supports fast writes, efficient reads, and crash recovery with minimal disk seeks. The system uses append-only writes, an in-memory index for O(1) lookups, and background compaction to reclaim space from obsolete entries.
The challenge is reasoning about storage engine fundamentals: I/O patterns (sequential vs random), write-ahead logging, crash recovery, on-disk formats, compaction/GC, and the trade-offs between latency, throughput, memory footprint, and data integrity.
Key Requirements
Functional
- Fast writes -- users write or update a key with a value and receive an acknowledgment with low latency
- Predictable reads -- users read the latest value for a key with consistent performance even as data grows
- Delete support -- users delete a key and subsequent reads reflect the deletion
- Crash recovery -- acknowledged writes survive crashes and the store recovers quickly to a consistent state
Non-Functional
- Scalability -- handle millions of keys with values of varying sizes; throughput scales with disk I/O bandwidth
- Reliability -- no data loss for acknowledged writes; recovery completes in seconds even with large datasets
- Latency -- sub-millisecond writes (sequential append); sub-millisecond reads for small values (single seek via index)
- Efficiency -- minimize disk space waste through background compaction; bound in-memory index size
Interview Reports from Hello Interview
1 report from a candidate. Most recently asked at LinkedIn in Early December 2025.
This question is primarily asked at LinkedIn.
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Append-Only Write Path
The core of Bitcask is sequential, append-only writes that maximize disk throughput. Interviewers expect a clear on-disk format.
Hints to consider:
- Each record in the data file contains: CRC checksum, timestamp, key size, value size, key bytes, value bytes
- Append all writes (including updates and deletes) to the active data file sequentially
- Use group commit (batch multiple writes before calling fsync) to amortize disk sync cost
- Rotate to a new active file when the current one exceeds a size threshold (e.g., 64MB)
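The record layout above can be sketched as a small encode/decode pair. This is a minimal illustration, assuming big-endian fixed-width headers and a millisecond timestamp; the exact field widths are a design choice, not part of the problem statement. The CRC covers everything after itself, so a torn append at the tail of a file is detectable.

```python
import struct
import time
import zlib

# Illustrative on-disk layout (field widths are assumptions):
#   crc(4) | timestamp(8) | key_size(4) | value_size(4) | key bytes | value bytes
BODY_HDR = struct.Struct(">QII")  # timestamp, key_size, value_size
CRC = struct.Struct(">I")

def encode_record(key, value, ts_ms=None):
    """Serialize one append-only record; the CRC covers everything after it."""
    ts = ts_ms if ts_ms is not None else int(time.time() * 1000)
    body = BODY_HDR.pack(ts, len(key), len(value)) + key + value
    return CRC.pack(zlib.crc32(body)) + body

def decode_record(buf, offset=0):
    """Return (key, value, timestamp, next_offset); raise on CRC mismatch."""
    (crc,) = CRC.unpack_from(buf, offset)
    body_start = offset + CRC.size
    ts, ksz, vsz = BODY_HDR.unpack_from(buf, body_start)
    end = body_start + BODY_HDR.size + ksz + vsz
    if zlib.crc32(buf[body_start:end]) != crc:
        raise ValueError("corrupt record")
    key = buf[body_start + BODY_HDR.size : body_start + BODY_HDR.size + ksz]
    value = buf[body_start + BODY_HDR.size + ksz : end]
    return key, value, ts, end
```

Because `decode_record` returns the offset of the next record, a writer can append `encode_record(...)` output sequentially and a reader can walk a file front to back, which is exactly what recovery and compaction need.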
2. In-Memory Index (Keydir)
Fast reads depend on an in-memory hash map that maps every key to its on-disk location. Interviewers want to understand the memory implications.
Hints to consider:
- The keydir maps each key to (file_id, value_offset, value_size, timestamp)
- Reads are O(1): look up the key in the keydir, seek to the offset in the correct file, read the value
- Memory usage is proportional to the number of unique keys (not total data size)
- For datasets where keys don't fit in memory, consider a hybrid approach with an on-disk index or range partitioning
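A minimal keydir sketch under the assumptions above: a plain hash map from key to `(file_id, value_offset, value_size, timestamp)`, where a later timestamp wins. The class and field names are illustrative, not a fixed API.

```python
from collections import namedtuple

# Keydir entry: which file holds the value, where, how big, and when written
Entry = namedtuple("Entry", "file_id value_offset value_size timestamp")

class Keydir:
    """In-memory hash index; memory grows with unique keys, not total data size."""

    def __init__(self):
        self._map = {}

    def put(self, key, entry):
        # Keep only the newest location; the older on-disk copy becomes dead data
        existing = self._map.get(key)
        if existing is None or entry.timestamp >= existing.timestamp:
            self._map[key] = entry

    def get(self, key):
        return self._map.get(key)  # None means "not found"

    def __len__(self):
        return len(self._map)
```

A read is then one dictionary lookup plus one `seek`/`read` of exactly `value_size` bytes in the indicated file, which is why read latency stays flat as the dataset grows.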
3. Compaction and Merge Process
Without compaction, disk usage grows without bound because old values are never reclaimed. Interviewers want a concrete algorithm.
Hints to consider:
- Select closed, immutable data files (never the active one) for compaction based on their ratio of dead to live entries
- Read selected files sequentially, check each entry against the keydir (is this the latest version?), and write live entries to a new compacted file
- Generate a hint file alongside each compacted file containing only key-to-offset mappings for fast recovery
- Swap compacted files atomically (rename) and delete old files; update the keydir to point to new locations
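The merge step above can be sketched as a pure function over in-memory stand-ins. Here `files` maps a file id to its `(key, value, offset)` records and `keydir` maps each key to its latest `(file_id, offset)`; an entry is live only if the keydir still points at it. The data shapes are simplifications for illustration, and the atomic rename/swap step is elided.

```python
def compact(files, keydir):
    """Copy only entries the keydir still points at; emit hint-file mappings.

    files:  dict of file_id -> list of (key, value, offset) records
    keydir: dict of key -> (file_id, offset) for the latest version
    Returns (live_records, hints) where hints pairs each key with its
    offset in the new compacted file -- the hint file contains no values.
    """
    live, hints = [], []
    new_offset = 0
    for file_id, records in files.items():
        for key, value, offset in records:
            if keydir.get(key) == (file_id, offset):  # latest version of this key?
                live.append((key, value))
                hints.append((key, new_offset))       # key -> offset only
                new_offset += len(value)
            # else: superseded or deleted entry -- dropped, space reclaimed
    return live, hints
```

In a real engine the live entries would be appended to a new data file, the hint list written beside it, and the old files atomically renamed away while the keydir is repointed at the new offsets.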
4. Crash Recovery
The system must restart quickly and recover to a consistent state without losing acknowledged writes.
Hints to consider:
- On startup, replay hint files (if available) to rebuild the keydir without reading full data files
- If hint files are missing, scan data files from oldest to newest, building the keydir by keeping the latest entry per key
- Validate CRC checksums during recovery to detect and skip corrupted entries
- Keep recovery time bounded by limiting data file sizes and maintaining up-to-date hint files after compaction
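The fallback scan (no hint files) can be sketched as a fold over files in age order. Each record here is a pre-parsed tuple standing in for a decoded on-disk entry, with a `crc_ok` flag standing in for checksum validation; because files are visited oldest to newest, a plain overwrite leaves the latest entry per key.

```python
def rebuild_keydir(data_files):
    """Recovery sketch: scan data files oldest -> newest, keeping the
    latest entry per key and skipping records that failed their CRC.

    data_files: ordered list of (file_id, records), oldest first, where
    each record is (key, offset, size, timestamp, crc_ok).
    Returns key -> (file_id, offset, size, timestamp).
    """
    keydir = {}
    for file_id, records in data_files:
        for key, offset, size, ts, crc_ok in records:
            if not crc_ok:
                continue  # corrupted entry (e.g. torn write at a file tail)
            keydir[key] = (file_id, offset, size, ts)  # newer files override
    return keydir
```

With hint files available, the same loop runs over the much smaller key-to-offset mappings instead of full data files, which is what keeps restart time in seconds rather than minutes.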
5. Tombstone Handling for Deletes
Deletes must be handled correctly in a log-structured system where old entries are not overwritten.
Hints to consider:
- Write a special tombstone record (key with empty value or a delete flag) on delete
- The keydir removes the key entry so reads return "not found" immediately
- Compaction filters out tombstones (and the values they obsolete) from older files
- Retain tombstones for a configurable period to ensure they propagate through compaction of all files
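The delete path can be modeled with a tiny in-memory store: the log stands in for the append-only data file, and a sentinel value stands in for the tombstone record. The sentinel and class names are illustrative; a real engine would use a flag bit or reserved value in the record header rather than a magic byte string.

```python
TOMBSTONE = b"__tombstone__"  # illustrative sentinel for a delete record

class Store:
    """Tiny model of the delete path: append-only log plus keydir."""

    def __init__(self):
        self.log = []     # stand-in for the on-disk data file
        self.keydir = {}  # key -> index of latest live entry in the log

    def put(self, key, value):
        self.log.append((key, value))
        self.keydir[key] = len(self.log) - 1

    def delete(self, key):
        self.log.append((key, TOMBSTONE))  # delete is still an append
        self.keydir.pop(key, None)         # reads miss immediately

    def get(self, key):
        idx = self.keydir.get(key)
        return None if idx is None else self.log[idx][1]
```

Note that `delete` never touches old log entries: the tombstone is only there so that compaction later knows to drop the key's earlier values, and so that recovery replaying the log does not resurrect a deleted key.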