Design a Real-time Traffic Monitoring System
System Design, Must
Problem Statement
Design a system that allows users to view real-time traffic conditions, congestion levels, and road status across a city by collecting and processing data from multiple sources like GPS devices, traffic cameras, and sensors. Users pan and zoom a map, watch color-coded roads update every few seconds, and get route ETAs that adapt to current conditions.
The system stresses end-to-end thinking across high-velocity data ingestion, stream processing, geospatial indexing, fan-out to many clients, and correctness under out-of-order and noisy data. You must balance latency vs accuracy, design for hotspots during rush hour, and plan for resilience and data quality. Strong candidates can scope the MVP, pick scalable patterns, and articulate tradeoffs and SLAs.
Key Requirements
Functional
- Live traffic map -- users view a live city map showing traffic speed, congestion levels, and incidents in near real time
- Route ETA -- users search a route and receive current ETA and delay estimates based on live conditions
- Road events -- users see road events such as accidents, closures, and construction with location and recency
- Viewport updates -- users receive live updates for their current viewport or selected area without refreshing
Non-Functional
- Scalability -- handle millions of GPS pings per minute from vehicles, sensors, and mobile devices during peak rush hour
- Reliability -- maintain service during partial failures; degrade gracefully by serving slightly stale data rather than failing
- Latency -- reflect traffic changes on the map within 10-30 seconds; serve route ETAs in under 500ms
- Consistency -- eventual consistency is acceptable; displayed traffic conditions should converge to an accurate state within the same 10-30 second freshness target
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. High-Volume Data Ingestion
Interviewers want to see how you handle millions of GPS pings and sensor readings per minute without losing data or creating bottlenecks.
Hints to consider:
- Use Kafka as a durable, partitioned ingestion layer that absorbs burst traffic during rush hour
- Partition by geographic region or road segment ID to enable parallel processing and maintain locality
- Implement idempotent consumers to handle duplicate GPS pings from retries or overlapping data sources
- Apply backpressure mechanisms to protect downstream processors during unexpected traffic spikes
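The partitioning and idempotency hints above can be sketched in a few lines. This is a minimal, broker-free illustration rather than a Kafka client: the partition count, the ping fields (`device_id`, `timestamp_ms`), and the choice of `(device_id, timestamp_ms)` as the deduplication key are all assumptions made for the example.

```python
import hashlib
from collections import OrderedDict

NUM_PARTITIONS = 12  # assumed partition count, for illustration only


def partition_for(region_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable hash so every ping for a region lands on the same partition."""
    digest = hashlib.md5(region_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


class IdempotentPingConsumer:
    """Drops pings whose (device_id, timestamp_ms) key was already processed."""

    def __init__(self, max_keys: int = 100_000):
        self._seen = OrderedDict()  # insertion-ordered set of recent keys
        self._max_keys = max_keys
        self.accepted = []

    def handle(self, ping: dict) -> bool:
        key = (ping["device_id"], ping["timestamp_ms"])
        if key in self._seen:
            return False  # duplicate from a retry or an overlapping source
        self._seen[key] = None
        if len(self._seen) > self._max_keys:
            self._seen.popitem(last=False)  # evict the oldest key to bound memory
        self.accepted.append(ping)
        return True
```

Bounding the seen-key set trades perfect deduplication for fixed memory: a duplicate that arrives after its key has been evicted would be processed again, so the bound should comfortably cover the expected retry window.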
2. Stream Processing and Aggregation
Raw GPS pings must be transformed into meaningful traffic metrics (speed, congestion level) per road segment. Interviewers probe your streaming architecture.
Hints to consider:
- Use a stream processor (Flink) with event-time windows and watermarks for handling out-of-order GPS data
- Map-match (snap) GPS coordinates to road segments using a geospatial index, then compute per-segment average speeds
- Apply windowed aggregation (e.g., 30-second tumbling windows) to smooth noisy data and detect trends
- Implement outlier detection to filter erroneous or unrepresentative GPS readings (physically impossible jumps between consecutive pings, pings from parked vehicles)
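To make the event-time windowing idea concrete, here is a small pure-Python sketch of 30-second tumbling windows closed by a watermark. It only imitates what a stream processor such as Flink would do and does not use any Flink API; the class name, the 5-second out-of-order bound, and the assumption that pings are already matched to a `segment_id` are hypothetical choices for the example.

```python
from collections import defaultdict

WINDOW_MS = 30_000           # 30-second tumbling windows, as in the hint above
MAX_OUT_OF_ORDER_MS = 5_000  # assumed bound on event-time disorder


class SegmentSpeedAggregator:
    """Averages speeds per (segment, window) and emits a window once the
    watermark (max event time seen minus the disorder bound) passes its end."""

    def __init__(self):
        self._windows = defaultdict(list)  # (segment_id, window_start) -> speeds
        self._max_event_ts = 0

    def _watermark(self) -> int:
        return self._max_event_ts - MAX_OUT_OF_ORDER_MS

    def add(self, segment_id: str, event_ts_ms: int, speed_kmh: float):
        """Returns a list of finalized (segment_id, window_start_ms, avg_speed)."""
        window_start = event_ts_ms - (event_ts_ms % WINDOW_MS)
        if window_start + WINDOW_MS <= self._watermark():
            return []  # the window was already emitted; drop the late event
        self._windows[(segment_id, window_start)].append(speed_kmh)
        self._max_event_ts = max(self._max_event_ts, event_ts_ms)
        return self._emit_closed()

    def _emit_closed(self):
        watermark = self._watermark()
        closed = sorted(k for k in self._windows if k[1] + WINDOW_MS <= watermark)
        results = []
        for key in closed:
            speeds = self._windows.pop(key)
            results.append((key[0], key[1], sum(speeds) / len(speeds)))
        return results
```

A ping stamped 28 s that arrives after one stamped 31 s still counts toward the first window, because the watermark has not yet passed 30 s. A larger disorder bound admits more late data at the cost of delaying every window's result by the same amount.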
3. Real-Time Map Updates to Clients
Interviewers want to see how you deliver traffic updates to potentially millions of concurrent map viewers efficiently.
Hints to consider:
- Use WebSockets or Server-Sent Events to push updates to connected clients
- Tile the map into geographic cells and subscribe clients only to cells in their current viewport
- Coalesce updates: send delta changes rather than full state to reduce bandwidth
- Cache the latest per-tile traffic state in Redis for fast initial load when users open the map
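The tiling, viewport subscription, and delta hints can be sketched as follows. The tile math is the standard Web Mercator "slippy map" scheme; the function names, the `(zoom, x, y)` tile key, and the per-segment congestion-level dictionaries are assumptions for illustration, and the viewport helper does not handle viewports that cross the antimeridian.

```python
import math


def tile_for(lat: float, lon: float, zoom: int) -> tuple:
    """Web Mercator tile (x, y) containing a point at the given zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge or near the poles stay in range.
    return (min(max(x, 0), n - 1), min(max(y, 0), n - 1))


def tiles_in_viewport(south, west, north, east, zoom) -> set:
    """All (zoom, x, y) tiles covering a bounding box (no antimeridian wrap)."""
    x_min, y_max = tile_for(south, west, zoom)  # tile y grows southward
    x_max, y_min = tile_for(north, east, zoom)
    return {
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    }


def subscription_changes(old_tiles: set, new_tiles: set) -> tuple:
    """Tiles to subscribe to and tiles to unsubscribe from after a pan or zoom."""
    return new_tiles - old_tiles, old_tiles - new_tiles


def tile_delta(previous: dict, current: dict) -> dict:
    """Only the segments in a tile whose congestion level changed."""
    return {seg: lvl for seg, lvl in current.items() if previous.get(seg) != lvl}
```

On a pan, the client (or its gateway connection) would apply `subscription_changes` to the old and new tile sets, load the cached state for newly added tiles, and from then on receive only `tile_delta` payloads for tiles it still holds.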
4. Data Quality and Reliability
Traffic data from real-world sensors is noisy, late, and sometimes incorrect. Interviewers expect strategies for maintaining accuracy.
Hints to consider:
- Handle late-arriving GPS data with allowed lateness windows in the stream processor
- Use median or trimmed mean instead of simple average to reduce impact of outlier readings
- Mark road segments as "stale" when sensor data stops arriving and show last-known-good data with timestamps
- Implement fallback to historical traffic patterns when real-time data is unavailable for a segment
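As a rough sketch of the robust-aggregation, staleness, and fallback hints above, the following helpers show one way the pieces might fit together. The 120-second staleness threshold, the function names, and the shape of the returned status dictionary are assumptions for the example, not values given in the problem.

```python
import statistics

STALE_AFTER_S = 120  # assumed: mark a segment stale after 2 minutes of silence


def trimmed_mean(values, trim_fraction: float = 0.1) -> float:
    """Mean after discarding the lowest and highest trim_fraction of readings."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if len(ordered) > 2 * k else ordered
    return sum(kept) / len(kept)


def segment_status(now_s, last_update_s, live_speeds, historical_speed_kmh):
    """Chooses live, last-known-good, or historical speed for one road segment."""
    if live_speeds:
        stale = (now_s - last_update_s) > STALE_AFTER_S
        return {
            "speed_kmh": statistics.median(live_speeds),  # robust to outliers
            "source": "last_known" if stale else "live",
            "stale": stale,
            "as_of_s": last_update_s,  # lets the client show data recency
        }
    # No real-time readings at all: fall back to the historical pattern.
    return {
        "speed_kmh": historical_speed_kmh,
        "source": "historical",
        "stale": True,
        "as_of_s": None,
    }
```

With readings of 40, 42, and 300 km/h on a segment, the median reports 42 where a simple average would report about 127, which is why a single bad reading does not recolor the road.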