Design a distributed caching system that serves data with low latency across multiple geographic locations, handling cache misses, replication, and scaling to billions of records.

The system must support basic key-value operations (GET, SET, DELETE) with TTL-based expiration and an LRU eviction policy when memory is full. Distribute data across nodes using consistent hashing with virtual nodes to minimize rebalancing when nodes are added or removed. For fault tolerance, replicate each key to multiple nodes (replication factor ≥ 2) and automatically fail over to a replica when a node dies.

The cache must remain highly available (99.99% uptime) with sub-5 ms p99 latency for cache hits at 1 million QPS. Choose and justify a caching strategy (cache-aside, write-through, or write-behind) and describe how stale data is prevented: via TTL, explicit invalidation, or pub/sub cross-node invalidation. The design should scale horizontally, so that adding nodes increases capacity roughly linearly with minimal data movement.

Finally, outline how you would monitor cache hit ratio, eviction rate, replication lag, and node health in production.
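To make the per-node requirements concrete, here is a minimal single-node sketch of the GET/SET/DELETE surface with lazy TTL expiration and LRU eviction. The class name and capacity parameter are illustrative, not part of the prompt; a production node would add locking, byte-based sizing, and an active expiry sweep.

```python
import time
from collections import OrderedDict


class LRUCacheWithTTL:
    """In-memory LRU cache with per-key TTL expiration (single node)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # key -> (value, expires_at); insertion order tracks recency
        self._store: "OrderedDict[str, tuple[object, float]]" = OrderedDict()

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None  # cache miss
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazy TTL expiration on read
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: str, value, ttl_seconds: float):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = (value, time.monotonic() + ttl_seconds)
        while len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used

    def delete(self, key: str) -> bool:
        return self._store.pop(key, None) is not None
```

Note that a GET refreshes recency, so a hot key survives eviction pressure even when the cache is full.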
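The consistent-hashing requirement can be sketched as a sorted ring of virtual-node hashes; walking the ring clockwise from a key's hash yields its primary and its replicas. The vnode count and the use of MD5 here are assumptions for illustration, as is returning the next distinct physical nodes as the replica set.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Consistent hash ring with virtual nodes. nodes_for_key returns
    the first `replicas` distinct physical nodes clockwise from the key."""

    def __init__(self, vnodes_per_node: int = 100):
        self.vnodes_per_node = vnodes_per_node
        self._ring: list = []  # sorted (hash, node) pairs

    @staticmethod
    def _hash(s: str) -> int:
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def add_node(self, node: str):
        # Each physical node owns many small arcs of the ring, so adding
        # or removing it moves only ~1/N of the keys.
        for i in range(self.vnodes_per_node):
            bisect.insort(self._ring, (self._hash(f"{node}#vn{i}"), node))

    def remove_node(self, node: str):
        self._ring = [(h, n) for (h, n) in self._ring if n != node]

    def nodes_for_key(self, key: str, replicas: int = 2) -> list:
        if not self._ring:
            return []
        start = bisect.bisect(self._ring, (self._hash(key), ""))
        owners: list = []
        for i in range(len(self._ring)):
            _, node = self._ring[(start + i) % len(self._ring)]
            if node not in owners:
                owners.append(node)
                if len(owners) == replicas:
                    break
        return owners
```

Failover falls out of the same walk: when the primary is removed from the ring, the next distinct node clockwise (already holding a replica) becomes the new primary for the key.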
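Of the three candidate strategies, cache-aside is the common default, and its read/write paths fit in a few lines. This sketch uses plain dicts as stand-ins for the cache node and the backing database (both hypothetical); note the write path invalidates rather than updates the cache, which avoids racing writers caching a stale value.

```python
import time


class CacheAsideClient:
    """Cache-aside pattern: read-through on miss, invalidate on write."""

    def __init__(self, cache: dict, db: dict, ttl_seconds: float = 300.0):
        self.cache = cache  # key -> (value, expires_at)
        self.db = db        # authoritative store (stand-in for a database)
        self.ttl = ttl_seconds

    def read(self, key):
        entry = self.cache.get(key)
        if entry is not None:
            value, expires_at = entry
            if time.monotonic() < expires_at:
                return value      # cache hit
            del self.cache[key]   # expired entry: treat as a miss
        value = self.db.get(key)  # miss: fetch from the source of truth
        if value is not None:
            self.cache[key] = (value, time.monotonic() + self.ttl)
        return value

    def write(self, key, value):
        self.db[key] = value
        # Invalidate rather than update: a concurrent reader could
        # otherwise repopulate the cache with an older value.
        self.cache.pop(key, None)
```

In the full system this local invalidation would be broadcast over pub/sub so that replicas on other nodes drop the key as well; TTL then serves as the backstop for any missed invalidation.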