Design a Temperature Monitoring System
System Design · Must
Problem Statement
Design a temperature monitoring system that displays current and historical temperature data across a 10 million square-kilometer area, with one sensor per 10 square kilometers. Users should be able to view real-time temperatures and query past temperature records for any location in the coverage area.
This implies roughly one million sensor endpoints continuously streaming measurements. The system is a real-time and historical data platform where users view live temperatures on a map and query past readings for any point in a large region. Interviewers ask this to test whether you can design a geo-aware, time-series system that handles sustained high write throughput, low-latency reads, and real-time updates while keeping storage under control.
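The sizing claims above are simple arithmetic worth stating explicitly in the interview; a quick sketch of the numbers:

```python
# Back-of-envelope sizing for the sensor fleet and sustained write rate.
AREA_KM2 = 10_000_000          # total coverage area, km^2
KM2_PER_SENSOR = 10            # one sensor per 10 km^2
REPORT_INTERVAL_S = 60         # each sensor reports once per minute

sensors = AREA_KM2 // KM2_PER_SENSOR              # 1,000,000 sensors
writes_per_second = sensors / REPORT_INTERVAL_S   # roughly 16,700 writes/s
```

At higher sampling rates (e.g. every 10 seconds) the write rate scales linearly, which is why the ingestion layer is the first thing interviewers probe.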
Key Requirements
Functional
- Current temperature -- users can view the current temperature for any location in the coverage area in near real time
- Historical queries -- users can query historical temperatures for a specific location over a chosen time range
- Sensor health -- users can see when a sensor was last updated and whether its data is currently healthy or delayed
- Regional aggregation -- users can view aggregated temperatures over a region (map tiles or bounding boxes) for a time window
Non-Functional
- Scalability -- support one million sensors, each reporting every minute (~16,700 writes/second sustained), with higher sampling rates possible
- Reliability -- tolerate individual sensor or processing node failures without data loss; backfill late data
- Latency -- serve current temperature lookups in under 100ms; historical range queries in under 500ms for reasonable time ranges
- Consistency -- eventual consistency for map updates (seconds of delay acceptable); strong consistency for historical query results
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. High-Throughput Ingestion Pipeline
With roughly one million sensors, even at one reading per minute, you face ~16,700 writes per second. Interviewers want to see a durable, scalable write path.
Hints to consider:
- Use Kafka as the ingestion layer, partitioned by sensor_id or geographic region for balanced load distribution
- Implement idempotent consumers to handle duplicate sensor readings from retries or overlapping data feeds
- Buffer and batch writes to the time-series store to reduce write amplification
- Apply backpressure mechanisms to handle bursts when many sensors report simultaneously
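The idempotency and batching hints above can be combined consumer-side. This is a minimal sketch in plain Python: the Kafka client and the time-series store are abstracted away, and names like `IdempotentBatcher` and `flushed` are illustrative, not a real client API.

```python
from collections import deque

class IdempotentBatcher:
    """Deduplicates readings by (sensor_id, ts) and batches writes.

    Sketch of consumer-side logic only; in production the dedup window
    would be sized to cover the retry horizon of upstream producers.
    """

    def __init__(self, batch_size=500, dedup_window=10_000):
        self.batch_size = batch_size
        self.seen = set()           # recently seen (sensor_id, ts) keys
        self.seen_order = deque()   # bounded FIFO so the set stays small
        self.dedup_window = dedup_window
        self.batch = []
        self.flushed = []           # stands in for bulk writes to the store

    def consume(self, reading):
        key = (reading["sensor_id"], reading["ts"])
        if key in self.seen:        # duplicate from a retry or overlapping feed
            return
        self.seen.add(key)
        self.seen_order.append(key)
        if len(self.seen_order) > self.dedup_window:
            self.seen.discard(self.seen_order.popleft())
        self.batch.append(reading)
        if len(self.batch) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.batch:              # one bulk write instead of N single writes
            self.flushed.append(list(self.batch))
            self.batch.clear()
```

Batching amortizes per-write overhead in the time-series store, and the bounded dedup set keeps memory flat even at a million sensors per partition group.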
2. Time-Series Data Modeling and Storage
Temperature readings are classic time-series data. Interviewers expect you to choose appropriate storage strategies and data models.
Hints to consider:
- Use a time-series store (TimescaleDB, InfluxDB, or Cassandra with a time-series schema) with partition keys combining sensor_id and time buckets
- Design partition keys to spread writes evenly and bound partition size: a composite partition key of (sensor_id, time_bucket) with timestamp as the clustering column, so no single partition grows without limit
- Implement data retention policies: keep raw data for 30 days, then downsample to hourly/daily averages for long-term storage
- Use TTL-based automatic expiration for raw data to manage storage growth
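The downsampling step in the retention policy above is easy to sketch. This is an illustrative batch job, assuming readings arrive as `(sensor_id, unix_ts, temp_c)` tuples; the real job would read from and write back to the time-series store.

```python
from collections import defaultdict

def downsample_hourly(readings):
    """Roll raw per-minute readings up to hourly averages per sensor.

    This is the kind of job a retention pipeline runs before TTL
    expiration deletes the raw data.
    """
    buckets = defaultdict(list)
    for sensor_id, ts, temp in readings:
        hour = ts - ts % 3600          # truncate the timestamp to the hour
        buckets[(sensor_id, hour)].append(temp)
    # (sensor_id, hour_start) -> mean temperature for that hour
    return {key: sum(temps) / len(temps) for key, temps in buckets.items()}
```

Hourly and daily rollups shrink storage by orders of magnitude (60x for hourly alone) while keeping long-range queries fast.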
3. Real-Time Map Display and Updates
Interviewers want to see how you serve live temperature data to users viewing a map without overloading storage.
Hints to consider:
- Cache the latest reading per sensor in Redis for sub-millisecond lookups
- Pre-compute temperature aggregates per map tile at multiple zoom levels
- Use WebSockets or Server-Sent Events to push temperature updates to clients viewing specific map regions
- Subscribe clients only to tiles in their current viewport to minimize unnecessary data transfer
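The tile-aggregation and viewport-subscription hints both need a way to map a sensor's coordinates to a tile key. A common choice (an assumption here, not mandated by the problem) is the Web Mercator "slippy map" tiling scheme:

```python
import math

def latlon_to_tile(lat, lon, zoom):
    """Map a lat/lon to a Web-Mercator (slippy map) tile coordinate.

    The (zoom, x, y) key can name both the pre-computed aggregate and
    the pub/sub channel a client subscribes to for its viewport.
    """
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return zoom, x, y

def tile_channel(zoom, x, y):
    """Illustrative channel name for WebSocket/SSE tile subscriptions."""
    return f"tiles:{zoom}:{x}:{y}"
```

When a client pans the map, it unsubscribes from tiles that left the viewport and subscribes to the new ones, so each client receives only updates it can actually display.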
4. Data Quality and Late-Arriving Data
Real sensors produce noisy, late, and sometimes incorrect data. Interviewers expect strategies for handling these issues.
Hints to consider:
- Validate readings within expected ranges and flag or discard obvious outliers
- Handle late-arriving data by accepting readings with a lateness window and updating affected aggregations
- Mark sensors as "stale" when readings stop arriving and display last-known-good data with a warning indicator
- Implement redundant sensors in critical areas to cross-validate readings
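The validation, lateness, and staleness rules above fit in a few lines of pipeline logic. The thresholds below are illustrative choices for discussion, not values fixed by the problem:

```python
VALID_RANGE_C = (-90.0, 60.0)   # plausible surface temperatures on Earth
ON_TIME_S = 120                 # within 2 minutes counts as on time
LATENESS_WINDOW_S = 15 * 60     # accept backfill up to 15 minutes late
STALE_AFTER_S = 3 * 60          # warn after 3 missed per-minute reports

def classify_reading(temp_c, reading_ts, now):
    """Return 'reject', 'late', or 'ok' for one incoming reading."""
    lo, hi = VALID_RANGE_C
    if not (lo <= temp_c <= hi):
        return "reject"         # obvious outlier: flag or discard
    age = now - reading_ts
    if age > LATENESS_WINDOW_S:
        return "reject"         # too old to backfill
    if age > ON_TIME_S:
        return "late"           # accepted, but affected aggregates must be recomputed
    return "ok"

def is_stale(last_seen_ts, now):
    """True when a sensor has missed enough reports to show a warning."""
    return now - last_seen_ts > STALE_AFTER_S
```

A "late" classification is what triggers the aggregate updates mentioned above, and `is_stale` drives the last-known-good warning indicator on the map.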