Design a 911 Call Center
Problem Statement
Design a 911 call center system that efficiently routes emergency calls to appropriate regional police stations and schedules employees into operational slots. The system must connect callers to available operators with minimal wait time, route incidents to the correct jurisdiction, and prioritize life-threatening emergencies.
A 911 call center is a real-time emergency call routing and dispatch platform where callers are connected to trained operators who triage incidents and route them to the correct regional agency as fast as possible. It also includes workforce scheduling to ensure enough operators are on shift to meet unpredictable demand.
Interviewers at Uber ask this to test whether you can design for ultra-low-latency decision-making, high availability under surges, and correct geospatial routing across jurisdictions, while also planning capacity through staff scheduling. They expect a strong high-level architecture, load distribution strategies, and sound routing algorithms rather than CRUD APIs.
Key Requirements
Functional
- Call intake -- callers reach an available operator with minimal wait time when dialing 911
- Geospatial routing -- calls are automatically routed to the correct regional station or agency based on caller location, incident type, and real-time availability
- Priority handling -- life-threatening incidents are prioritized ahead of lower-urgency calls, with preemption during surges
- Staff scheduling -- operators can be staffed, scheduled, and adjusted with on-call rosters for continuous coverage
Non-Functional
- Scalability -- handle thousands of concurrent calls during mass emergency events (natural disasters, major incidents)
- Reliability -- 99.999% uptime; system must survive regional outages and partitions without losing calls
- Latency -- caller connected to operator within 10 seconds under normal load; routing decision under 1 second
- Consistency -- strong consistency for call assignment (no call assigned to two operators); eventual consistency for scheduling views
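The "no call assigned to two operators" requirement amounts to an atomic claim: the first operator to claim a call wins, and every later claim is rejected. A minimal in-process sketch follows, assuming a hypothetical CallAssignments class. In a real deployment the same check-and-set would more likely be a conditional write in a strongly consistent store; the lock here only stands in for that guarantee.

```python
import threading
from typing import Dict, Optional


class CallAssignments:
    """Hypothetical registry that lets exactly one operator claim each call."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owner: Dict[str, str] = {}  # call_id -> operator_id

    def try_claim(self, call_id: str, operator_id: str) -> bool:
        """Atomically assign the call if it is unowned; return False otherwise."""
        with self._lock:
            if call_id in self._owner:
                return False
            self._owner[call_id] = operator_id
            return True

    def owner_of(self, call_id: str) -> Optional[str]:
        """Return the operator holding the call, or None if unclaimed."""
        with self._lock:
            return self._owner.get(call_id)


assignments = CallAssignments()
first = assignments.try_claim("call-1", "operator-a")   # succeeds
second = assignments.try_claim("call-1", "operator-b")  # rejected
```

The check and the write happen under one lock, so two operators racing for the same call can never both see it as unowned.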
What Interviewers Focus On
Based on real interview experiences at Uber and Salesforce, these are the areas interviewers probe most deeply:
1. Call Routing and Jurisdiction Matching
Routing based solely on the nearest station or on round-robin ignores jurisdiction boundaries and station capacity. Interviewers want a correct, adaptive routing algorithm.
Hints to consider:
- Map caller location to jurisdiction using a precomputed geospatial index (polygons for jurisdictions stored in PostGIS or a spatial cache)
- Factor in real-time station capacity: if the primary jurisdiction's station is at capacity, overflow to a neighboring station
- Consider incident type routing: medical emergencies may route to fire/EMS, not police
- Cache jurisdiction lookups in Redis for sub-millisecond decisions on the hot path
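The jurisdiction lookup and capacity overflow described in these hints can be sketched in a few lines. This is an illustration under stated assumptions, not a production design: the Station type, its fields, and route_call are hypothetical names, boundaries are simple polygons of (longitude, latitude) points, and a plain ray-casting test stands in for what a spatial index such as PostGIS would do.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count boundary crossings of a ray going right from p."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass
class Station:
    """Hypothetical station record; field names are assumptions."""
    name: str
    boundary: List[Point]
    capacity: int
    active_calls: int = 0
    neighbors: List[str] = field(default_factory=list)


def route_call(location: Point, stations: Dict[str, Station]) -> Optional[str]:
    """Pick the jurisdiction's station, overflowing to a neighbor when full."""
    primary = next(
        (s for s in stations.values() if point_in_polygon(location, s.boundary)),
        None,
    )
    if primary is None:
        return None  # location is outside every known jurisdiction
    if primary.active_calls < primary.capacity:
        return primary.name
    for name in primary.neighbors:
        neighbor = stations[name]
        if neighbor.active_calls < neighbor.capacity:
            return neighbor.name
    return primary.name  # everyone is full: queue at the home jurisdiction


stations = {
    "north": Station("north", [(0, 0), (10, 0), (10, 10), (0, 10)], 2,
                     neighbors=["south"]),
    "south": Station("south", [(0, -10), (10, -10), (10, 0), (0, 0)], 2,
                     neighbors=["north"]),
}
normal = route_call((5, 5), stations)
stations["north"].active_calls = 2  # north is now at capacity
overflow = route_call((5, 5), stations)
```

The linear scan over stations is only for readability. With many jurisdictions, the scan would be replaced by a spatial index or a cached grid-cell-to-jurisdiction map.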
2. Overload Protection and Priority Queuing
During mass emergencies, call volume can spike 10-100x. Without explicit priority and backpressure, life-threatening calls get lost in the queue.
Hints to consider:
- Implement priority queues where life-threatening calls preempt lower-urgency ones
- Use token bucket rate limiting per jurisdiction to prevent one overloaded area from consuming all operators
- Implement circuit breakers: if a station is unreachable, reroute to backup stations rather than dropping calls
- Provide callers with position-in-queue information and estimated wait time
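Two of the hints above, priority ordering with position-in-queue and a per-jurisdiction token bucket, are sketched below. All names (PriorityCallQueue, TokenBucket, the priority numbers) are hypothetical. Lower numbers mean higher urgency, and the clock is passed in explicitly so the behavior is deterministic.

```python
import heapq
import itertools
from typing import List, Optional, Tuple


class PriorityCallQueue:
    """Min-heap keyed by (priority, arrival order); 1 = life-threatening."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order per priority

    def enqueue(self, call_id: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._seq), call_id))

    def dequeue(self) -> Optional[str]:
        """Return the most urgent, longest-waiting call, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def position(self, call_id: str) -> Optional[int]:
        """1-based position in service order; O(n log n), fine for a sketch."""
        for pos, (_, _, cid) in enumerate(sorted(self._heap), start=1):
            if cid == call_id:
                return pos
        return None


class TokenBucket:
    """Admits up to `burst` calls at once, refilling at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: int, now: float = 0.0) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)
        self.last = now

    def allow(self, now: float) -> bool:
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


queue = PriorityCallQueue()
queue.enqueue("noise-complaint", 3)
queue.enqueue("cardiac-arrest", 1)
queue.enqueue("fender-bender", 3)
fender_position = queue.position("fender-bender")
served = [queue.dequeue(), queue.dequeue(), queue.dequeue()]

bucket = TokenBucket(rate_per_sec=1.0, burst=2)
admitted = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

One design choice worth raising in an interview: the bucket would plausibly gate only lower-urgency traffic, so that a life-threatening call is never rejected by a rate limit.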
3. High Availability and Disaster Recovery
A 911 system must survive regional outages. Single points of failure are unacceptable.
Hints to consider:
- Deploy active-active across multiple regions with automatic failover
- Replicate call queue state across availability zones so in-progress calls survive node failures
- Design degraded-mode operations: if the routing service is unavailable, fall back to round-robin assignment
- Use persistent connections with automatic reconnection for operator workstations
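The degraded-mode hint can be illustrated with a simplified breaker that stops calling the routing service after repeated failures and assigns stations round-robin instead. DegradedModeRouter, RoutingUnavailable, and the threshold value are hypothetical. A fuller circuit breaker would also add a timed half-open state to probe for recovery, which this sketch replaces with a manual reset.

```python
import itertools
from typing import Callable, List


class RoutingUnavailable(Exception):
    """Raised by the (hypothetical) routing service client when it cannot answer."""


class DegradedModeRouter:
    """Use the primary router while healthy; fall back to round-robin otherwise."""

    def __init__(self, primary: Callable[[str], str], stations: List[str],
                 failure_threshold: int = 3) -> None:
        self._primary = primary
        self._fallback = itertools.cycle(stations)
        self._threshold = failure_threshold
        self._failures = 0

    @property
    def degraded(self) -> bool:
        """True once consecutive failures reach the threshold (breaker open)."""
        return self._failures >= self._threshold

    def route(self, call_id: str) -> str:
        if not self.degraded:
            try:
                station = self._primary(call_id)
                self._failures = 0  # any success closes the breaker again
                return station
            except RoutingUnavailable:
                self._failures += 1
        # The call is never dropped: it always gets some station.
        return next(self._fallback)

    def reset(self) -> None:
        """Manual recovery hook standing in for a half-open probe."""
        self._failures = 0


def broken_router(call_id: str) -> str:
    raise RoutingUnavailable("routing service timed out")


router = DegradedModeRouter(broken_router, ["station-a", "station-b"])
assigned = [router.route(f"call-{i}") for i in range(4)]
```

Even the calls that trip the breaker are assigned through the fallback, which matches the requirement that a routing outage must not lose calls.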
4. Workforce Scheduling and Capacity Planning
Ensuring enough operators are on duty requires scheduling that accounts for historical patterns and surge capacity.
Hints to consider:
- Model scheduling as shift slots with coverage requirements per time-of-day and day-of-week based on historical call volume
- Support on-call overflow rosters that can be activated during surges
- Provide real-time dashboards showing current queue depth, operator utilization, and estimated wait times
- Feed scheduling decisions with historical data: call volume patterns by hour, day, and season
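Coverage requirements per hour can be estimated from historical volume with a simple offered-load heuristic: multiply calls per hour by average handle time, then divide by a target utilization. The sketch below assumes that heuristic along with hypothetical function names, a 0.75 utilization target, and shifts given as (start_hour, end_hour, operator_count) that do not wrap past midnight. An Erlang C model would give tighter answers for a wait-time target, but it is not shown here.

```python
import math
from typing import Dict, List, Tuple

Shift = Tuple[int, int, int]  # (start_hour inclusive, end_hour exclusive, operators)


def required_operators(calls_per_hour: float, avg_handle_time_sec: float,
                       target_utilization: float = 0.75) -> int:
    """Operators needed so that average utilization stays at or below target."""
    offered_load = calls_per_hour * avg_handle_time_sec / 3600.0  # in Erlangs
    return math.ceil(offered_load / target_utilization)


def coverage_gaps(forecast: Dict[int, float], shifts: List[Shift],
                  avg_handle_time_sec: float,
                  target_utilization: float = 0.75) -> Dict[int, int]:
    """Map each understaffed hour to the number of extra operators needed."""
    gaps: Dict[int, int] = {}
    for hour, calls in forecast.items():
        staffed = sum(count for start, end, count in shifts if start <= hour < end)
        needed = required_operators(calls, avg_handle_time_sec, target_utilization)
        if needed > staffed:
            gaps[hour] = needed - staffed
    return gaps


# Illustrative numbers only: 60 calls at 09:00, 120 calls at 18:00, 3-minute calls.
forecast = {9: 60.0, 18: 120.0}
shifts = [(8, 16, 5), (16, 24, 6)]
gaps = coverage_gaps(forecast, shifts, avg_handle_time_sec=180)
```

Hours that appear in the result are natural candidates for activating the on-call overflow roster.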