Design a Temperature Monitoring System

Problem Statement

Design a temperature monitoring system that displays current and historical temperature data across a 10 million km² area, with one sensor per 10 km². Users should be able to view real-time temperatures on a map and query past temperature records for any location in the coverage area.

At one sensor per 10 km², this implies around one million sensor endpoints (10,000,000 / 10) continuously streaming measurements. The system must handle sustained high write throughput, low-latency reads for current conditions, real-time map updates, and efficient historical range queries. The core challenges are time-series ingestion at scale, geospatial indexing, and delivering a responsive UI through caching and fan-out while keeping storage under control.
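The sizing above can be checked with a quick back-of-envelope calculation. This is only a sketch: the one-minute reporting interval matches the scalability requirement below, while the 32 bytes per stored reading is an assumed figure for illustration, not something the problem specifies.

```python
# Back-of-envelope sizing. Values marked "assumed" are illustrative only.
AREA_KM2 = 10_000_000        # coverage area from the problem statement
KM2_PER_SENSOR = 10          # one sensor per 10 km^2
REPORT_INTERVAL_S = 60       # assumed: one reading per sensor per minute
BYTES_PER_READING = 32       # assumed: id, timestamp, value, storage overhead

sensors = AREA_KM2 // KM2_PER_SENSOR
writes_per_second = sensors / REPORT_INTERVAL_S
readings_per_day = sensors * (86_400 // REPORT_INTERVAL_S)
raw_gb_per_day = readings_per_day * BYTES_PER_READING / 1e9

print(f"sensors:            {sensors:,}")
print(f"writes per second:  {writes_per_second:,.0f}")
print(f"readings per day:   {readings_per_day:,}")
print(f"raw storage per day: {raw_gb_per_day:.2f} GB")
```

Under these assumptions the raw stream is about 1.44 billion readings per day, which is why the retention and rollup strategy matters as much as the write path.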

Interviewers at Uber ask this to test whether you can design a geo-aware, time-series system that handles high write throughput, low-latency reads, and real-time updates. Strong answers show clear data modeling for time-range queries, backpressure management, and a practical caching and fan-out strategy.

Key Requirements

Functional

Real-time temperature view -- users view the current temperature for any location in the coverage area in near real time
Historical queries -- users query historical temperatures for a specific location over a chosen time range
Sensor health -- users see when a sensor was last updated and whether its data is currently healthy or delayed
Regional aggregation -- users view aggregated temperatures over a region (map tiles or bounding boxes) for a time window

Non-Functional

Scalability -- handle approximately 1 million sensors reporting at intervals of 1 minute or less, producing roughly 16,700 or more writes per second sustained (1,000,000 sensors / 60 s)
Reliability -- tolerate sensor and node failures without data loss; degrade gracefully if downstream services (such as aggregation or push delivery) fail
Latency -- current temperature reads under 100 ms P99; historical range queries under 500 ms for typical time ranges
Consistency -- eventual consistency acceptable for map displays; sensor health status should reflect actual state within seconds
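The sensor health requirement can be derived from the timestamp of the last reading received. The sketch below is one possible approach: the thresholds (two and ten reporting intervals) and the extra `OFFLINE` state are assumptions for illustration, since the requirements only name "healthy" and "delayed".

```python
from enum import Enum


class Health(Enum):
    HEALTHY = "healthy"
    DELAYED = "delayed"
    OFFLINE = "offline"   # assumed extra state, not named in the requirements


def classify_health(
    last_seen_s: float,
    now_s: float,
    report_interval_s: float = 60.0,
    delayed_after: float = 2.0,    # assumed: intervals missed before "delayed"
    offline_after: float = 10.0,   # assumed: intervals missed before "offline"
) -> Health:
    """Classify a sensor by the age of its most recent reading."""
    age_s = now_s - last_seen_s
    if age_s <= delayed_after * report_interval_s:
        return Health.HEALTHY
    if age_s <= offline_after * report_interval_s:
        return Health.DELAYED
    return Health.OFFLINE
```

Because the status is computed from `last_seen_s` at read time, it stays accurate within seconds without a separate health-update write path.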

What Interviewers Focus On

Based on real interview experiences at Uber, Amazon, and Oracle, these are the areas interviewers probe most deeply:

1. Ingestion Pipeline and Write Scaling

With roughly one million sensors, even moderate sampling rates create intense write loads. Interviewers want to see how you handle sustained ingestion without creating hot shards.

Hints to consider:

Ingest sensor readings into Kafka partitioned by sensor_id to spread load and maintain per-sensor ordering
Use idempotent consumers with event timestamps to handle duplicates and out-of-order arrivals from unreliable sensor networks
Batch writes to the time-series store (flush every 5 seconds per partition) to reduce write amplification
Apply backpressure at the ingestion layer so that producers slow down, and shed load gracefully when traffic spikes exceed capacity
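The hints above can be sketched in a few dozen lines without any Kafka client. Everything here is hypothetical: `partition_for` uses CRC32 as a stand-in for a real partitioner, `BatchingWriter` and its `sink` callable are illustrative names, and deduplication only covers readings still in the buffer, so the store is assumed to upsert on `(sensor_id, event_ts)` to stay idempotent across batches.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    event_ts: int     # epoch seconds stamped by the sensor, not arrival time
    temp_c: float


def partition_for(sensor_id: str, num_partitions: int) -> int:
    """Stable hash so all readings from one sensor share a partition."""
    return zlib.crc32(sensor_id.encode("utf-8")) % num_partitions


class BatchingWriter:
    """Buffers readings, drops duplicates, and flushes on a time interval."""

    def __init__(self, sink, flush_interval_s=5.0, max_pending=10_000):
        self._sink = sink                    # callable taking a list of Reading
        self._flush_interval_s = flush_interval_s
        self._max_pending = max_pending
        self._pending = {}                   # (sensor_id, event_ts) -> Reading
        self._last_flush = None

    def offer(self, reading: Reading, now_s: float) -> bool:
        """Return False when the buffer is full so the caller can back off."""
        if self._last_flush is None:
            self._last_flush = now_s
        key = (reading.sensor_id, reading.event_ts)
        if key not in self._pending and len(self._pending) >= self._max_pending:
            return False
        self._pending[key] = reading         # a redelivered reading overwrites itself
        if now_s - self._last_flush >= self._flush_interval_s:
            self.flush(now_s)
        return True

    def flush(self, now_s: float) -> None:
        if self._pending:
            # Sort by event time so out-of-order arrivals are written in order.
            batch = sorted(self._pending.values(),
                           key=lambda r: (r.sensor_id, r.event_ts))
            self._sink(batch)
            self._pending = {}
        self._last_flush = now_s
```

A consumer would call `offer` for each message and pause fetching from its partition whenever it returns `False`, which is one simple way to propagate backpressure upstream.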

2. Time-Series Storage and Query Patterns

Historical queries require efficient range scans over potentially years of data. Interviewers evaluate your storage model and retention strategy.

Hints to consider:

Use a store suited to time-series data (Cassandra with time-bucketed partitions, or a purpose-built TSDB like TimescaleDB) keyed by (sensor_id, time_bucket), with the reading timestamp ordering rows inside each bucket
Implement time-bucketed partitioning (hourly or daily) to enable efficient range scans and partition-level deletion for retention
Pre-compute rollups (hourly and daily averages) asynchronously to speed up longer-range queries
Apply TTL-based retention policies: raw data for 30 days, hourly rollups for 1 year, daily rollups indefinitely
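To make the bucketing and rollup hints concrete, here is a minimal sketch assuming daily buckets and timezone-aware UTC timestamps. The function names `partition_keys` and `hourly_rollup` are hypothetical, and the in-memory rollup stands in for what would normally be an asynchronous job.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def partition_keys(sensor_id: str, start: datetime, end: datetime):
    """List the (sensor_id, day) partitions a range query must read."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    keys = []
    while day <= last:
        keys.append((sensor_id, day.isoformat()))
        day += timedelta(days=1)
    return keys


def hourly_rollup(readings):
    """Average (timestamp, temp_c) pairs into one value per UTC hour."""
    sums = defaultdict(lambda: [0.0, 0])     # hour -> [total, count]
    for ts, temp_c in readings:
        hour = ts.astimezone(timezone.utc).replace(
            minute=0, second=0, microsecond=0)
        sums[hour][0] += temp_c
        sums[hour][1] += 1
    return {hour: total / count
            for hour, (total, count) in sorted(sums.items())}
```

A query for a few days touches only a handful of partitions, and a query over months can read the hourly or daily rollups instead of raw rows. With daily buckets, expiring raw data after 30 days amounts to dropping whole partitions.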

3. Real-Time Map Updates

Users viewing the temperature map need fresh data without polling every second. Interviewers probe how you push updates efficiently to many concurrent viewers.

Hints to consider:

Store the latest reading per sensor in Redis with a short TTL for sub-millisecond current-value lookups
Use a tile-based approach: aggregate sensors into map tiles and push tile-level updates to subscribed clients via WebSockets or SSE
Coalesce rapid updates within the same tile into periodic refreshes (every 10-30 seconds) to reduce fan-out volume
Use Redis Pub/Sub or a similar pub/sub layer to route tile updates to the correct WebSocket servers
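The coalescing step can be illustrated with a small in-memory buffer. This is a sketch under assumptions: `TileCoalescer` is a hypothetical name, the `publish` callable stands in for whatever pub/sub layer routes updates to WebSocket servers, and the 10-second interval is the low end of the range suggested above.

```python
class TileCoalescer:
    """Keeps only the newest payload per tile and publishes at most once
    per refresh interval, which bounds fan-out volume."""

    def __init__(self, publish, refresh_interval_s: float = 10.0):
        self._publish = publish              # callable(tile_id, payload)
        self._refresh_interval_s = refresh_interval_s
        self._dirty = {}                     # tile_id -> latest unsent payload
        self._last_sent = {}                 # tile_id -> time of last publish

    def update(self, tile_id, payload) -> None:
        # Later updates replace earlier ones that have not been sent yet.
        self._dirty[tile_id] = payload

    def tick(self, now_s: float) -> int:
        """Publish dirty tiles whose interval has elapsed; return the count."""
        sent = 0
        for tile_id in list(self._dirty):
            last = self._last_sent.get(tile_id)
            if last is None or now_s - last >= self._refresh_interval_s:
                self._publish(tile_id, self._dirty.pop(tile_id))
                self._last_sent[tile_id] = now_s
                sent += 1
        return sent
```

Calling `tick` from a timer means a tile receiving hundreds of sensor updates in ten seconds still produces a single message per subscriber.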

4. Geospatial Indexing and Regional Aggregation

Users want to see temperature patterns across regions, not just individual sensors. Interviewers look for spatial query support.

Hints to consider:

Index sensors by geohash or S2 cell ID to enable efficient spatial lookups (find all sensors in a bounding box)
Pre-aggregate tile-level statistics (min, max, average temperature) during stream processing for common zoom levels
Cache tile aggregates in Redis with TTLs matching the update frequency
Support drill-down: show tile-level summary at low zoom, individual sensor data at high zoom
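Tile-level pre-aggregation can be sketched as follows. Note the substitution: instead of geohash or S2 cell IDs, this example uses Web Mercator "slippy map" tile coordinates, because they map directly onto map zoom levels and need only the standard library. The function names are hypothetical, and a real pipeline would do this incrementally in the stream processor rather than over a list.

```python
import math
from collections import defaultdict


def tile_for(lat: float, lon: float, zoom: int):
    """Return the (zoom, x, y) Web Mercator tile containing a point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge stay inside the tile grid.
    return zoom, min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def aggregate_tiles(readings, zoom: int):
    """Group (lat, lon, temp_c) readings by tile and summarize each tile."""
    groups = defaultdict(list)
    for lat, lon, temp_c in readings:
        groups[tile_for(lat, lon, zoom)].append(temp_c)
    return {
        tile: {
            "min": min(temps),
            "max": max(temps),
            "avg": sum(temps) / len(temps),
            "count": len(temps),
        }
        for tile, temps in groups.items()
    }
```

Running `aggregate_tiles` at a few fixed zoom levels yields the summaries to cache for low-zoom views, while high-zoom views can read individual sensor values directly.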
