Design a price notification system
Problem Statement
Design a price notification system that allows users to track Amazon product prices, subscribe to price drop alerts, and view historical pricing data for items in their wishlist. Think of CamelCamelCamel-style tracking with wishlist integration, price history charts, and multi-channel notifications (email, push).
Interviewers ask this because it touches ingestion at scale (consuming price feeds or scraping), event-driven pipelines, high-fanout notifications, and time-series storage. It tests your ability to separate hot vs cold paths, build reliable asynchronous workflows, and make sensible tradeoffs for latency, cost, and correctness. Expect to reason about scheduling data collection, deduplication, idempotent notifications, and efficient query patterns for history views.
Key Requirements
Functional
- Product tracking -- users add Amazon products (by URL or ASIN) to their watchlist and see the current price
- Price alert subscriptions -- users subscribe to price drop alerts with configurable rules (target price, percentage drop)
- Historical price data -- users view price history charts across configurable time windows (7 days, 30 days, 1 year)
- Notification preferences -- users choose and manage notification channels (email, push, SMS) and can pause or resume alerts
Non-Functional
- Scalability -- track tens of millions of products with hundreds of millions of subscribers; handle batch price updates for the entire catalog
- Reliability -- ensure no missed notifications when a price drops below a user's threshold; tolerate crawler or data source failures
- Latency -- price updates reflected in the system within 15 minutes of actual change; notifications delivered within 5 minutes of detection
- Consistency -- eventual consistency acceptable for price history and display; at-least-once delivery for notifications
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Price Collection and Ingestion Pipeline
Collecting prices for millions of products reliably is the foundation. Interviewers want to see how you schedule data collection, handle rate limits, and ensure data freshness.
Hints to consider:
- Design a job scheduler that prioritizes products based on subscriber count and price volatility
- Implement per-domain rate limiting with adaptive backoff to avoid being throttled or banned
- Use Kafka as a durable buffer between price collectors and downstream processing, enabling replay on failure
- Deduplicate price observations using (product_id, timestamp, price) to avoid false change detections from crawler retries
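The deduplication hint above can be sketched as a small component. This is a minimal in-memory illustration; a production system would more likely back the seen-set with a TTL'd Redis key or a compacted Kafka topic keyed the same way (an assumption, not part of the problem statement):

```python
import hashlib


class PriceObservationDeduper:
    """Drops repeated (product_id, timestamp, price) observations so that
    crawler retries are not mistaken for new price events downstream.

    In-memory set for illustration only; swap for Redis/Kafka in production.
    """

    def __init__(self):
        self._seen = set()

    def _key(self, product_id, timestamp, price_cents):
        raw = f"{product_id}:{timestamp}:{price_cents}"
        return hashlib.sha256(raw.encode()).hexdigest()

    def is_new(self, product_id, timestamp, price_cents):
        """Return True the first time an observation is seen, False on retries."""
        key = self._key(product_id, timestamp, price_cents)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A duplicate observation (same product, timestamp, and price) is silently absorbed, so the change detector only ever sees each price event once.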
2. Change Detection and Notification Fanout
When a price drops, potentially millions of subscribers for popular products need to be notified. Interviewers test how you detect meaningful changes and fan out notifications without overwhelming downstream services.
Hints to consider:
- Compare new prices against the previous observation to detect changes, using a threshold to filter minor fluctuations
- For each price change, look up affected subscribers and their alert rules, evaluating each rule against the new price
- Fan out notification tasks to Kafka partitioned by user ID, consumed by notification workers that respect per-channel rate limits
- Implement idempotent notification delivery using (user_id, product_id, price_event_id) deduplication keys
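The rule-evaluation and idempotency hints can be made concrete with a short sketch. The rule shape (target price OR minimum percentage drop) follows the functional requirements; the field names and thresholds are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    user_id: str
    target_price_cents: Optional[int] = None  # fire when price <= target
    min_drop_pct: Optional[float] = None      # fire when drop >= this percent


def matching_users(rules, old_price_cents, new_price_cents):
    """Evaluate each subscriber's rule against a detected price change."""
    if new_price_cents >= old_price_cents:
        return []  # only price *drops* trigger alerts
    drop_pct = 100.0 * (old_price_cents - new_price_cents) / old_price_cents
    hits = []
    for r in rules:
        if r.target_price_cents is not None and new_price_cents <= r.target_price_cents:
            hits.append(r.user_id)
        elif r.min_drop_pct is not None and drop_pct >= r.min_drop_pct:
            hits.append(r.user_id)
    return hits


def notification_key(user_id, product_id, price_event_id):
    """Idempotency key; workers skip deliveries whose key is already recorded."""
    return f"{user_id}:{product_id}:{price_event_id}"
```

The matched user IDs would be produced to the user-partitioned Kafka topic, and each notification worker checks `notification_key` against a store (e.g. Redis SETNX) before sending, giving effectively-once delivery on top of at-least-once consumption.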
3. Time-Series Storage for Price History
Users expect to view price trends over months or years for tracked products. Interviewers probe your storage choices and query efficiency.
Hints to consider:
- Store price observations in a time-series-friendly format: partition by product_id with sort key on timestamp for efficient range queries
- Implement tiered storage: full-resolution data for recent observations (30 days), downsampled daily averages for older data
- Cache frequently accessed price histories (popular products, recent time windows) in Redis with TTL eviction
- Pre-compute common chart data (min, max, average over standard windows) to reduce query-time computation
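The tiered-storage hint implies a downsampling job that rolls full-resolution observations into daily aggregates once they age out of the hot window. A minimal sketch of that rollup, assuming observations arrive as `(unix_timestamp, price_cents)` pairs:

```python
from collections import defaultdict
from datetime import datetime, timezone


def downsample_daily(observations):
    """Collapse (timestamp, price_cents) observations into per-day
    min/max/avg rollups -- the shape the cold tier stores after the
    30-day full-resolution window expires."""
    buckets = defaultdict(list)
    for ts, price in observations:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        buckets[day].append(price)
    return {
        day: {"min": min(prices), "max": max(prices),
              "avg": sum(prices) / len(prices)}
        for day, prices in buckets.items()
    }
```

Because min/max/avg are exactly the values the history charts render, the downsampled rows double as the pre-computed chart data mentioned above, so long-window queries never touch raw observations.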
4. Crawl Scheduling and Prioritization
Not all products need the same update frequency. Interviewers want to see intelligent scheduling that balances freshness with resource costs.
Hints to consider:
- Assign products to priority tiers based on subscriber count, recent price volatility, and deal event schedules
- High-priority products (many subscribers, frequent changes) are crawled every 15 minutes; low-priority products every 6-24 hours
- Use a distributed job queue with rate-aware scheduling to spread load evenly across time
- Implement a feedback loop where detected price changes increase a product's crawl frequency temporarily
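The tiering and feedback-loop ideas above can be captured in a single scheduling function. The tier thresholds below are illustrative assumptions, not tuned values:

```python
def crawl_interval_minutes(subscriber_count, changes_last_7d, recently_changed=False):
    """Map a product's demand and volatility onto a crawl interval.

    Thresholds are placeholders; a real system would tune them against
    crawl capacity and freshness SLAs.
    """
    if subscriber_count >= 10_000 or changes_last_7d >= 5:
        interval = 15          # hot tier: every 15 minutes
    elif subscriber_count >= 100 or changes_last_7d >= 1:
        interval = 60          # warm tier: hourly
    else:
        interval = 6 * 60      # cold tier: every 6 hours
    if recently_changed:       # feedback loop: a fresh change halves the interval
        interval = max(15, interval // 2)
    return interval
```

The scheduler would recompute this interval whenever subscriber counts or change history update, and enqueue the next crawl job with jitter so load spreads evenly across time.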
Suggested Approach
Step 1: Clarify Requirements
Start by confirming scope. Ask about the product catalog size, expected subscriber counts per product, and notification delivery SLAs. Clarify whether the system crawls Amazon directly or consumes a price feed API. Determine what notification channels are required and whether batch/digest notifications are acceptable. Establish price history retention requirements and chart granularity expectations.