Design a Real-time Traffic Monitoring System
Problem Statement
Design a system that allows users to view real-time traffic conditions, congestion levels, and road status across a city by collecting and processing data from multiple sources like GPS devices, traffic cameras, and sensors. Users pan and zoom a map, watch color-coded roads update every few seconds, and get route ETAs that adapt to current conditions -- think Google Maps or Waze live traffic.
The system ingests high-velocity streams of geospatial and status data from diverse sources, maintains a consistent view of road segment speeds, handles out-of-order and noisy data, and delivers low-latency updates to millions of concurrent users. It must gracefully handle peak periods (rush hour) while ensuring data accuracy and reliability.
Interviewers at Uber ask this because it stresses end-to-end thinking across high-velocity data ingestion, stream processing, geospatial indexing, fan-out to many clients, and correctness under out-of-order and noisy data. You must balance latency vs accuracy, design for hotspots, and plan for resilience and data quality.
Key Requirements
Functional
- Live traffic map -- users view a city map that shows traffic speed, congestion level, and incidents in near real time
- Route ETA -- users search a route and receive current ETA and delay estimates based on live conditions
- Road events -- users see accidents, closures, and construction with location and recency information
- Live viewport updates -- users receive live updates for their current viewport without refreshing the page
Non-Functional
- Scalability -- handle millions of GPS pings per minute from vehicles, thousands of sensor readings, and millions of concurrent map viewers
- Reliability -- maintain 99.9% uptime for map queries with graceful degradation to cached results when live data is unavailable
- Latency -- reflect traffic changes on the map within 5 seconds of receiving data; map tile loads under 500 ms
- Consistency -- eventual consistency acceptable for traffic display; road event status changes should propagate within seconds
What Interviewers Focus On
Based on real interview experiences at Uber, Oracle, and Google, these are the areas interviewers probe most deeply:
1. Telemetry Ingestion and Stream Processing
The system receives events from thousands of sources with varying reliability and clock synchronization. Interviewers want to see how you handle out-of-order arrival, deduplicate events, and compute per-road-segment speeds.
Hints to consider:
- Ingest GPS pings and sensor data into Kafka partitioned by geographic region or road segment ID
- Use Flink with event-time windows and watermarks to compute per-segment average speeds over short windows (e.g., 1-minute tumbling windows, or sliding windows if smoother updates are needed)
- Map-match GPS coordinates to road segments using the road network graph, filtering out points that fall too far from any road
- Handle late-arriving and duplicate data through allowed lateness windows and idempotency keys
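The windowing, lateness, and deduplication hints above can be sketched in plain Python. This is a simplified model of what a stream processor such as Flink would do, not Flink's API. The `Ping` fields, the 30-second allowed lateness, and the rule that the watermark equals the largest event time seen so far are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

WINDOW_S = 60            # tumbling window size, matching the 1-minute example
ALLOWED_LATENESS_S = 30  # assumed value; tune to observed device delays


@dataclass(frozen=True)
class Ping:
    ping_id: str      # idempotency key (hypothetical field name)
    segment_id: str   # road segment the ping was already map-matched to
    event_ts: int     # event time in epoch seconds, stamped by the device
    speed_mps: float


class SegmentSpeedAggregator:
    """Event-time tumbling windows with dedup and allowed lateness."""

    def __init__(self):
        # (segment_id, window_start) -> [sum of speeds, ping count]
        self.windows = defaultdict(lambda: [0.0, 0])
        self.seen_ids = set()
        self.watermark = 0  # simplification: max event time seen so far

    def add(self, ping: Ping) -> bool:
        """Return True if the ping was counted, False if dropped."""
        if ping.ping_id in self.seen_ids:
            return False  # duplicate delivery
        self.watermark = max(self.watermark, ping.event_ts)
        window_start = ping.event_ts - ping.event_ts % WINDOW_S
        window_end = window_start + WINDOW_S
        if window_end + ALLOWED_LATENESS_S <= self.watermark:
            return False  # window already closed; ping is too late
        self.seen_ids.add(ping.ping_id)
        acc = self.windows[(ping.segment_id, window_start)]
        acc[0] += ping.speed_mps
        acc[1] += 1
        return True

    def average(self, segment_id: str, window_start: int):
        """Average speed for one window, or None if it has no pings."""
        total, count = self.windows.get((segment_id, window_start), (0.0, 0))
        return total / count if count else None
```

A production job would also expire old windows and idempotency keys so that state stays bounded; the sketch keeps everything in memory to stay short.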
2. Geospatial Data Model and Indexing
Traffic data must be queryable by geographic region and zoom level. Interviewers expect you to model road segments efficiently and support spatial queries for map tile generation.
Hints to consider:
- Divide the map into tiles using a quadtree or geohash-based tiling system aligned with standard map tile coordinates
- Store per-segment aggregates (speed, congestion level) in Redis keyed by segment ID with short TTLs
- Pre-aggregate tile-level summaries for common zoom levels to avoid per-request computation
- Use a geospatial index to map GPS coordinates to the nearest road segment efficiently
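As a rough illustration of the tiling hint, the sketch below converts a latitude/longitude to standard Web Mercator XYZ tile coordinates and then to a quadtree key (the quadkey scheme used by Bing Maps). The function names and the `tile:` cache-key format are hypothetical, chosen only for this example.

```python
import math

MAX_LAT = 85.05112878  # Web Mercator latitude limit


def latlon_to_tile(lat: float, lon: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) XYZ tile containing the point at this zoom level."""
    lat = max(-MAX_LAT, min(MAX_LAT, lat))
    n = 2 ** zoom  # tiles per axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that lon == 180 or lat == -MAX_LAT stays inside the grid.
    return min(x, n - 1), min(y, n - 1)


def tile_to_quadkey(x: int, y: int, zoom: int) -> str:
    """Interleave the bits of x and y into one base-4 digit per zoom level."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def tile_cache_key(lat: float, lon: float, zoom: int) -> str:
    """Hypothetical cache key for a pre-aggregated tile summary."""
    x, y = latlon_to_tile(lat, lon, zoom)
    return f"tile:{tile_to_quadkey(x, y, zoom)}"
```

One convenient property of quadkeys is that a tile's parent at the next lower zoom is its key with the last digit removed, which makes rolling segment aggregates up into coarser zoom levels a prefix operation.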
3. Real-time Update Fan-out to Clients
When traffic conditions change, millions of users viewing the affected area need updated tiles. Interviewers probe how you minimize latency and resource usage while handling connection churn.
Hints to consider:
- Use WebSockets or Server-Sent Events for push-based delivery of tile updates to active viewers
- Subscribe clients to tile IDs for their current viewport; only push updates for tiles that actually changed
- Coalesce rapid updates (multiple segment changes within the same tile) into periodic tile refreshes, keeping the interval short enough (a few seconds) to stay within the 5-second freshness target
- Use a pub/sub layer (Redis Pub/Sub or NATS) to route tile change notifications to the correct WebSocket servers
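The subscription and coalescing hints above can be sketched as a small in-memory registry that a single push server might hold. The class and method names are hypothetical. A real deployment would sit behind the pub/sub layer and send over WebSockets or SSE, whereas this sketch only computes which client should receive which tile IDs.

```python
from collections import defaultdict


class TileFanout:
    """Tracks viewport subscriptions and batches changed tiles per client."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # tile_id -> set of client ids
        self.viewports = {}                  # client_id -> set of tile ids
        self.dirty = set()                   # tiles changed since last flush

    def set_viewport(self, client_id, tile_ids):
        """Replace a client's subscriptions when they pan or zoom."""
        new = set(tile_ids)
        old = self.viewports.get(client_id, set())
        for tile_id in old - new:
            self.subscribers[tile_id].discard(client_id)
            if not self.subscribers[tile_id]:
                del self.subscribers[tile_id]
        for tile_id in new - old:
            self.subscribers[tile_id].add(client_id)
        self.viewports[client_id] = new

    def disconnect(self, client_id):
        """Drop all subscriptions for a client whose connection closed."""
        self.set_viewport(client_id, ())
        self.viewports.pop(client_id, None)

    def segment_changed(self, tile_id):
        """Mark a tile dirty; repeated changes collapse into one entry."""
        self.dirty.add(tile_id)

    def flush(self):
        """Return {client_id: [tile ids to push]} and reset the dirty set."""
        batches = defaultdict(list)
        for tile_id in sorted(self.dirty):
            for client_id in self.subscribers.get(tile_id, ()):
                batches[client_id].append(tile_id)
        self.dirty.clear()
        return dict(batches)
```

A timer would call `flush()` once per coalescing interval, so a burst of segment changes inside one tile costs each viewer a single message, and tiles nobody is viewing cost nothing.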
4. ETA Calculation
Route-level ETA requires combining current segment speeds across a path. Interviewers look for how you balance accuracy with latency.
Hints to consider:
- Precompute and cache travel times for frequently requested routes or sub-paths to avoid an expensive graph traversal on every request
- Use a routing engine that reads live segment speeds and computes shortest-time paths
- Update ETA predictions as the user travels by re-querying affected segments
- Fall back to historical speed patterns when live data is unavailable for a segment
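To make the live-speed routing and historical fallback concrete, here is a minimal sketch using Dijkstra's algorithm with travel time as the edge weight. The graph layout (`node -> list of segment dicts`), the field names, and the `MIN_SPEED_MPS` floor are assumptions for illustration. Production routing engines typically use faster techniques than plain Dijkstra, so treat this as a model of the idea rather than a recommended implementation.

```python
import heapq
import math

MIN_SPEED_MPS = 0.5  # assumed floor so a stalled segment never divides by zero


def segment_seconds(seg, live_mps, historical_mps):
    """Travel time for one segment: live speed if present, else historical."""
    speed = live_mps.get(seg["id"])
    if speed is None:
        speed = historical_mps[seg["id"]]  # fallback when live data is missing
    return seg["length_m"] / max(speed, MIN_SPEED_MPS)


def fastest_eta(graph, src, dst, live_mps, historical_mps):
    """Return the shortest travel time in seconds, or None if unreachable."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        elapsed, node = heapq.heappop(heap)
        if node == dst:
            return elapsed
        if elapsed > best.get(node, math.inf):
            continue  # stale heap entry; a faster path was already found
        for seg in graph.get(node, ()):
            candidate = elapsed + segment_seconds(seg, live_mps, historical_mps)
            if candidate < best.get(seg["to"], math.inf):
                best[seg["to"]] = candidate
                heapq.heappush(heap, (candidate, seg["to"]))
    return None


# Tiny example network: a direct road A->C and a detour A->B->C.
GRAPH = {
    "A": [{"id": "ab", "to": "B", "length_m": 1000.0},
          {"id": "ac", "to": "C", "length_m": 3000.0}],
    "B": [{"id": "bc", "to": "C", "length_m": 1000.0}],
}
HISTORICAL = {"ab": 10.0, "bc": 10.0, "ac": 20.0}
```

With no live data the direct road wins on historical speeds; once a live reading shows `ac` slowed to 5 m/s, the same call returns the detour's travel time instead. Re-running the query for the remaining path as the user travels gives the adaptive ETA described above.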