Design a global-scale Live Chat System that supports 5 million concurrent WebSocket connections and must survive traffic spikes like the Super Bowl, where a single chat room can contain 5 million users who collectively send 100 messages/second. Your architecture must guarantee at-most-once message delivery with <200 ms p95 end-to-end latency, and provide presence indicators, typing indicators, read receipts, and offline message sync. You must handle the fan-out challenge in the hottest room (5 million recipients × 100 messages/second = 500 million deliveries/second) by implementing hierarchical fan-out (message → Kafka → regional relay servers → WebSocket servers → clients) and server-side message sampling that delivers 100 % of messages when room volume is low but only 10 % when it exceeds 1 000 msg/sec, while still prioritizing messages from followed users and moderators. Messages must appear in a consistent order to every user; use server-assigned per-room sequence numbers and client-side buffering to hide network jitter. Design for multi-region deployment, automatic horizontal scaling, and 99.99 % availability with no single point of failure. You may assume end-to-end encryption is handled by a separate team; focus on the real-time delivery, fan-out, ordering, and scaling layers.
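
To make three of the requirements above concrete, here is a minimal Python sketch of the fan-out arithmetic, the sampling rule, and a client-side reorder buffer. It is one possible reading of the brief, not a reference design. The names `fanout_rate`, `should_deliver`, and `ReorderBuffer` are hypothetical, and so are two choices the brief leaves open: keeping every tenth sequence number as the 10 % sample, and a 50 ms hold window. Sampled-out messages leave gaps in the sequence numbers a client sees, so the buffer orders by sequence number within a time window instead of waiting for contiguous numbers.

```python
import heapq

SAMPLE_THRESHOLD = 1000  # msg/sec above which sampling starts (from the brief)
KEEP_ONE_IN = 10         # 10 % sample; "every 10th seq" is an assumed policy


def fanout_rate(recipients: int, room_msgs_per_sec: int) -> int:
    """Deliveries per second if every message reaches every recipient."""
    return recipients * room_msgs_per_sec


def should_deliver(seq: int, room_msgs_per_sec: int,
                   sender_is_moderator: bool, sender_is_followed: bool) -> bool:
    """Decide whether one recipient gets the message with sequence number seq.

    The decision is deterministic in seq, so every relay or WebSocket server
    picks the same sample without coordination. Priority senders bypass it.
    """
    if sender_is_moderator or sender_is_followed:
        return True
    if room_msgs_per_sec <= SAMPLE_THRESHOLD:
        return True
    return seq % KEEP_ONE_IN == 0


class ReorderBuffer:
    """Client-side buffer that releases messages in sequence order.

    Each message is held for hold_ms (assumed value) so that messages with
    smaller sequence numbers delayed by jitter can still be placed first.
    Anything at or below the last released sequence number is dropped, which
    preserves ordering and at-most-once display.
    """

    def __init__(self, hold_ms: int = 50) -> None:
        self.hold_ms = hold_ms
        self._heap: list[tuple[int, int, str]] = []  # (seq, arrival_ms, payload)
        self._last_released = 0

    def push(self, seq: int, payload: str, now_ms: int) -> None:
        if seq <= self._last_released:
            return  # arrived too late to show in order; drop it
        heapq.heappush(self._heap, (seq, now_ms, payload))

    def drain(self, now_ms: int) -> list[tuple[int, str]]:
        """Return messages whose hold window has elapsed, smallest seq first."""
        released = []
        while self._heap and self._heap[0][1] + self.hold_ms <= now_ms:
            seq, _, payload = heapq.heappop(self._heap)
            if seq <= self._last_released:
                continue  # duplicate of an already released message
            self._last_released = seq
            released.append((seq, payload))
        return released
```

A short usage example: three messages arrive out of order and are released in sequence order once the hold window has passed.

```python
buf = ReorderBuffer(hold_ms=50)
buf.push(12, "c", now_ms=0)
buf.push(10, "a", now_ms=5)
buf.push(11, "b", now_ms=10)
print(buf.drain(now_ms=60))  # [(10, 'a'), (11, 'b'), (12, 'c')]
```

The hold window is spent out of the 200 ms latency budget, so a real client would likely tune it per connection. Because the followed-user check depends on the recipient, that part of the rule has to run at the WebSocket server layer, not upstream in the relays.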