Practice/xAI/Durable KV Cache
CodingMust
Design and implement a durable key-value cache that persists data to storage, ensuring that cached data survives process restarts or crashes. Unlike a traditional in-memory cache that loses all data on failure, a durable cache can recover its state by loading data from persistent storage.
This pattern is commonly used in:
`
cache = DurableKVCache(storage_path="cache_data.txt")
cache.put("user:123", '{"name": "Alice", "role": "admin"}') cache.put("config:theme", "dark")
print(cache.get("user:123")) # {"name": "Alice", "role": "admin"}
cache2 = DurableKVCache(storage_path="cache_data.txt") print(cache2.get("user:123")) # {"name": "Alice", "role": "admin"} (recovered!) print(cache2.get("config:theme")) # dark `
Your implementation should support:
get(key): Retrieve the value associated with the key. Returns None if the key does not exist.put(key, value): Insert or update a key-value pair, ensuring durability.put() block until data is persisted, or can it be asynchronous?Implement a DurableKVCache class that uses an append-only log strategy: every put() operation appends a new entry to a log file. On initialization, the cache replays the entire log to restore state.
Note on Terminology: This approach uses an append-only log (also called log-structured storage). Each write appends a record to the log, and the log itself is the persistent storage. This differs from write-through caching (which rewrites the entire file) and Write-Ahead Logging (WAL), which logs operations before applying them to a separate data store.
The key insight is that appending is fast (sequential I/O), but stale entries accumulate over time.
` cache = DurableKVCache("cache.txt") cache.put("key1", "value1") cache.put("key1", "value2") # Update
`
key\tvalue\nget() operationsput() to the log file immediatelyAs the cache accumulates updates, the storage file grows with stale (outdated) entries. When a key is updated multiple times, all previous values remain in storage, wasting disk space and slowing down recovery time.
` cache = DurableKVCache("cache.txt") cache.put("counter", "1") # Storage: counter=1 cache.put("counter", "2") # Storage: counter=1, counter=2 cache.put("counter", "3") # Storage: counter=1, counter=2, counter=3
`
What are the tradeoffs of the append-only log approach?
How would you handle stale data?
When should compaction happen?
Add a compaction_threshold parameter and trigger compaction after a certain number of writes:
` cache = DurableKVCache("cache.txt", compaction_threshold=100)
`