Hey r/DarkInterview — sharing a Netflix coding question from https://darkinterview.com .
Design and implement a key-value cache with automatic expiration. This is a classic interview question asked at Netflix (also known as the "Log Rate Limiter" problem) that tests your understanding of data structures, memory management, and system design trade-offs.
Implement an AutoExpireCache class with set and get:
`python import time
cache = AutoExpireCache()
cache.set("user_session", {"user_id": 123}, 2) print(cache.get("user_session")) # {"user_id": 123}
time.sleep(3)
print(cache.get("user_session")) # None (expired)
cache.set("feature_flag", False, 60) print(cache.get("feature_flag")) # False (not None!) `
This part is straightforward — store (value, expire_timestamp) in a dict and use lazy deletion: check expiry on get() and delete if stale.
Key design decision: Return None for cache misses, not False, since False can be a valid cached value.
> Interviewer: "If we set millions of keys that are never accessed again, they stay in memory forever. How do you fix this?"
This is where it gets interesting. Three approaches:
Run a daemon thread that scans and removes expired entries on an interval:
python def _cleanup_expired(self): current = time.time() with self.lock: expired = [k for k, (_, exp) in self.cache.items() if current > exp] for k in expired: del self.cache[k]
Trade-off: O(N) scan, but simple and effective. Requires thread safety with locks.
Use a min-heap sorted by expiry time so you only process entries that have actually expired — O(E log N) where E is expired entries:
python heapq.heappush(self.expiry_heap, (expire_at, key, version))
Trick: Use version tracking to handle stale heap entries when keys are updated with new TTLs.
| Approach | Set | Get | Cleanup | Memory Safety | |---|---|---|---|---| | Lazy only | O(1) | O(1) | N/A | ❌ | | Periodic scan | O(1) | O(1) | O(N) | ✅ | | Min-heap | O(log N) | O(1) | O(E log N) | ✅ |
> Interviewer: "What if memory is still a concern? Enforce a maximum cache size with LRU eviction."
Combine TTL expiration with LRU eviction using OrderedDict:
`python from collections import OrderedDict
class AutoExpireCache: def init(self, capacity: int): self.capacity = capacity self.cache = OrderedDict()
def set(self, key, value, ttl_seconds):
if key in self.cache:
del self.cache[key]
while len(self.cache) >= self.capacity:
self.cache.popitem(last=False) # Evict LRU
self.cache[key] = (value, time.time() + ttl_seconds)
def get(self, key):
if key not in self.cache:
return None
value, expire_at = self.cache[key]
if time.time() > expire_at:
del self.cache[key]
return None
self.cache.move_to_end(key) # Mark as recently used
return value
`
The interviewer may also ask you to implement LRU from scratch using a doubly-linked list + hash map for O(1) operations.
The interviewer may extend the conversation to:
concurrent.futures?get() refresh the TTL?mset(), mget() for multiple keys at onceFull question + Python solutions with all 3 implementation strategies: https://darkinterview.com/collections/n3f6x9k2/questions/ef32a27f-b05a-4e44-a838-1bfd7979ccaf