Design a 911 Call Center
System Design | Must
Problem Statement
Design a real-time dispatch system for a logistics company managing thousands of delivery vehicles across multiple cities. When customers place orders, the system must instantly assign the optimal driver based on location, vehicle capacity, traffic conditions, and driver availability. The platform also needs to handle route optimization, driver check-ins, real-time status updates, and dynamic reassignment when delays or cancellations occur.
Your system should support 10,000+ active drivers during peak hours, process 500+ dispatch requests per second, and handle sudden demand spikes during holidays or promotional events. The dispatch decision must happen in under 200ms to maintain competitive delivery times. Consider scenarios like driver unavailability, vehicle breakdowns, traffic congestion, and partial network outages that could affect certain geographic regions.
Key Requirements
Functional
- Instant driver assignment -- Match incoming delivery requests to available drivers within 200ms based on proximity, capacity, and constraints
- Real-time location tracking -- Continuously monitor driver positions with 5-10 second update intervals to enable accurate routing decisions
- Dynamic reassignment -- Automatically redistribute pending deliveries when drivers become unavailable or routes change
- Route optimization -- Calculate efficient multi-stop routes considering traffic, time windows, and vehicle capacity
- Driver schedule management -- Track shift patterns, breaks, availability windows, and support manual overrides by dispatchers
Non-Functional
- Scalability -- Handle 10,000 concurrent drivers, 500 dispatch requests/sec, and 100,000 location updates/minute across multiple metropolitan areas
- Reliability -- Maintain 99.99% uptime with automatic failover; no dispatch should be lost even during partial system failures
- Latency -- Complete driver matching in under 200ms; route calculations under 500ms; location updates reflected within 2 seconds
- Consistency -- Ensure no driver receives duplicate assignments (the assignment itself must be strongly consistent); views of driver status across regions may be eventually consistent
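A quick back-of-envelope check helps confirm that the stated numbers fit together. The sketch below uses only figures from the requirements above; the function and variable names are illustrative, and the in-flight estimate assumes every match takes the full 200ms budget.

```python
DRIVERS = 10_000          # concurrent drivers (from the requirements)
DISPATCH_RPS = 500        # dispatch requests per second
MATCH_BUDGET_S = 0.200    # matching latency budget in seconds


def updates_per_minute(drivers, interval_s):
    """Location updates per minute if every driver reports once per interval."""
    return drivers * 60 / interval_s


# 5-10 second reporting intervals bracket the stated 100,000 updates/minute.
low = updates_per_minute(DRIVERS, 10)    # 60,000 per minute
high = updates_per_minute(DRIVERS, 5)    # 120,000 per minute

# Little's law: requests in flight = arrival rate * time in system.
# Worst case (every match uses the whole budget) is about 100 concurrent matches.
in_flight = DISPATCH_RPS * MATCH_BUDGET_S

# 99.99% uptime leaves roughly 52.6 minutes of downtime per 365-day year.
downtime_min_per_year = 365 * 24 * 60 * (1 - 0.9999)

print(low, high, in_flight, round(downtime_min_per_year, 1))
```

The write load (on the order of 1,000-2,000 location updates per second) is larger than the dispatch load, which is one reason the location pipeline and the matching path are usually discussed separately.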
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Matching Algorithm and Geospatial Indexing
The core challenge is instantly finding the best available driver among thousands of candidates scattered across a city. Naive approaches that scan all drivers or use simple distance calculations will not scale to the required latency budget.
Hints to consider:
- Use spatial indexing structures like geohashes or quadtrees to partition drivers into grid cells and query only nearby candidates
- Consider pre-filtering by capacity and availability before running expensive distance calculations
- Discuss trade-offs between accuracy and speed when using cheap approximations like straight-line or Manhattan distance versus exact driving distance from a routing service
- Plan for how the index stays synchronized as drivers move continuously
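The hints above can be sketched with a fixed-size grid, which is a simplified stand-in for geohashes or a quadtree. Everything here is an assumption for illustration: the `DriverGrid` class, the 0.01-degree cell size, and the choice of haversine distance as the cheap approximation. A real system would likely rank the shortlisted candidates by driving time.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumed cell size: about 1.1 km of latitude


def cell_of(lat, lon):
    """Map a coordinate to an integer grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class DriverGrid:
    def __init__(self):
        self.cells = defaultdict(set)  # cell -> driver ids currently in it
        self.drivers = {}              # driver id -> (lat, lon, free_capacity, available)

    def update(self, driver_id, lat, lon, free_capacity, available):
        """Apply a location ping, moving the driver between cells if needed."""
        old = self.drivers.get(driver_id)
        if old is not None:
            self.cells[cell_of(old[0], old[1])].discard(driver_id)
        self.drivers[driver_id] = (lat, lon, free_capacity, available)
        self.cells[cell_of(lat, lon)].add(driver_id)

    def nearest(self, lat, lon, needed_capacity, rings=1):
        """Scan only the cells around the pickup; return (driver_id, km) or None."""
        cx, cy = cell_of(lat, lon)
        best = None
        for dx in range(-rings, rings + 1):
            for dy in range(-rings, rings + 1):
                for driver_id in self.cells.get((cx + dx, cy + dy), ()):
                    dlat, dlon, cap, avail = self.drivers[driver_id]
                    # Cheap filters run before the distance calculation.
                    if not avail or cap < needed_capacity:
                        continue
                    dist = haversine_km(lat, lon, dlat, dlon)
                    if best is None or dist < best[1]:
                        best = (driver_id, dist)
        return best
```

If `nearest` returns `None`, a caller could retry with a larger `rings` value, trading latency for a wider search area.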
2. Real-Time State Management and Contention
Thousands of drivers update their locations every few seconds while dispatch decisions read and modify driver availability states. Concurrent dispatch requests might compete for the same driver, requiring careful coordination to prevent double-booking.
Hints to consider:
- Use optimistic locking or compare-and-swap operations to claim drivers atomically without global locks
- Consider caching driver states in in-memory stores like Redis with TTL-based invalidation for stale data
- Implement a reservation mechanism where drivers are temporarily locked during the matching process
- Discuss how to handle the race condition when multiple dispatchers try to assign the same driver simultaneously
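One way to illustrate the compare-and-swap idea is a versioned record per driver: a claim succeeds only if the version is unchanged since it was read. The `DriverStore` class below is a hypothetical in-memory stand-in, not a real client library; its internal lock only models the per-key atomicity that a real data store would provide.

```python
import threading


class DriverStore:
    """In-memory stand-in for a store offering atomic compare-and-swap."""

    def __init__(self):
        self._lock = threading.Lock()  # models the store's per-key atomicity
        self._state = {}               # driver_id -> (status, version)

    def put(self, driver_id, status):
        with self._lock:
            _, version = self._state.get(driver_id, (None, 0))
            self._state[driver_id] = (status, version + 1)

    def get(self, driver_id):
        with self._lock:
            return self._state[driver_id]

    def compare_and_swap(self, driver_id, expected_version, new_status):
        """Write new_status only if nobody changed the record since it was read."""
        with self._lock:
            _, version = self._state[driver_id]
            if version != expected_version:
                return False
            self._state[driver_id] = (new_status, version + 1)
            return True


def try_claim(store, driver_id, order_id):
    """Optimistically claim a driver; False means another dispatcher won."""
    status, version = store.get(driver_id)
    if status != "available":
        return False
    return store.compare_and_swap(driver_id, version, f"assigned:{order_id}")


if __name__ == "__main__":
    store = DriverStore()
    store.put("driver-1", "available")
    outcomes = []
    workers = [
        threading.Thread(
            target=lambda i=i: outcomes.append(try_claim(store, "driver-1", f"order-{i}"))
        )
        for i in range(8)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(outcomes.count(True))  # exactly one dispatcher claims the driver
```

A dispatcher that loses the race simply moves on to its next-best candidate, so no global lock is needed.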
3. Handling Overload and Prioritization
During peak events or outages in specific regions, the system may receive more requests than it can immediately fulfill. Without explicit backpressure and prioritization, the system could thrash or starve high-priority deliveries.
Hints to consider:
- Implement priority queues that favor time-sensitive deliveries (food, medicine) over standard packages
- Use circuit breakers to prevent cascading failures when downstream routing services slow down
- Consider a two-tier architecture where fast-path matching handles 90% of cases and complex cases queue for slower processing
- Design graceful degradation modes that fall back to simpler algorithms when latency budgets are exceeded
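A minimal sketch of prioritization with backpressure, assuming three hypothetical priority tiers and a bounded fast-path queue. When the queue is full, the lowest-priority order is handed back to the caller so it can be moved to a slower overflow path instead of being dropped. The `DispatchQueue` name and tier values are illustrative.

```python
import heapq
import itertools

PRIORITY = {"medicine": 0, "food": 1, "standard": 2}  # assumed tiers; lower runs first


class DispatchQueue:
    """Bounded priority queue that sheds the lowest-priority order when full."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order within a tier

    def offer(self, order_id, kind):
        """Enqueue an order; return the order id pushed to the slow path, or None."""
        entry = (PRIORITY[kind], next(self._seq), order_id)
        if len(self._heap) < self.max_size:
            heapq.heappush(self._heap, entry)
            return None
        worst = max(self._heap)  # O(n) scan; acceptable for a sketch
        if entry[0] >= worst[0]:
            return order_id      # the new order is no more urgent: defer it
        self._heap.remove(worst)
        heapq.heapify(self._heap)
        heapq.heappush(self._heap, entry)
        return worst[2]          # the evicted order goes to the slow path

    def poll(self):
        """Return the most urgent order id, or None if the queue is empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None
```

Returning the shed order rather than discarding it keeps the sketch compatible with the requirement that no dispatch is lost; the caller decides where deferred orders wait.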
4. Multi-Region Architecture and Failover
Logistics operations span multiple cities and regions, each potentially operating semi-independently. The system must continue functioning when one region's infrastructure fails or experiences network partitions.
Hints to consider:
- Partition data by geographic region to minimize cross-region coordination and latency
- Deploy dispatch logic close to drivers using regional service clusters with local state
- Plan for how pending deliveries in a failed region can be picked up by neighboring regions or failover clusters
- Discuss consistency trade-offs when driver state must be replicated across regions for redundancy
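The routing side of regional failover can be sketched as a lookup with an ordered fallback list. The region names, the `NEIGHBORS` table, and the idea that a health checker supplies the set of healthy regions are all assumptions for illustration; replicating driver state so that a neighbor can actually serve the request is a separate problem not shown here.

```python
# Hypothetical fallback order per home region, nearest neighbor first.
NEIGHBORS = {
    "city-a": ["city-b", "city-c"],
    "city-b": ["city-a"],
    "city-c": ["city-a", "city-b"],
}


def pick_region(home, healthy, neighbors=NEIGHBORS):
    """Return the region that should handle a request, or None if none can."""
    if home in healthy:
        return home  # normal case: stay local, no cross-region hop
    for candidate in neighbors.get(home, []):
        if candidate in healthy:
            return candidate
    return None  # caller should queue the request durably and retry


if __name__ == "__main__":
    print(pick_region("city-a", {"city-a", "city-b"}))  # city-a
    print(pick_region("city-a", {"city-c"}))            # city-c
    print(pick_region("city-b", {"city-c"}))            # None
```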
Suggested Approach
Step 1: Clarify Requirements
Start by confirming the scope and constraints with your interviewer. Ask about the expected scale (number of drivers, requests per second, geographic coverage), latency targets for matching decisions, and whether the system needs to handle multi-stop routes or just single pickup-to-delivery assignments. Clarify if drivers have different vehicle types or capacity constraints, and whether there are priority tiers for deliveries. Confirm how often drivers update their location and what happens when drivers go offline unexpectedly. Understanding these parameters will guide your architecture choices.