Design a price notification system
Problem Statement
Design a price notification system that allows users to track Amazon product prices, subscribe to price drop alerts, and view historical pricing data for items in their wishlist. Think of CamelCamelCamel-style tracking with wishlist integration, price history charts, and multi-channel notifications (email, push).
Interviewers ask this because it touches ingestion at scale (consuming price feeds or scraping), event-driven pipelines, high-fanout notifications, and time-series storage. It tests your ability to separate hot vs cold paths, build reliable asynchronous workflows, and make sensible tradeoffs for latency, cost, and correctness. Expect to reason about scheduling data collection, deduplication, idempotent notifications, and efficient query patterns for history views.
Key Requirements
Functional
- Product tracking -- users add Amazon products (by URL or ASIN) to their watchlist and see the current price
- Price alert subscriptions -- users subscribe to price drop alerts with configurable rules (target price, percentage drop)
- Historical price data -- users view price history charts across configurable time windows (7 days, 30 days, 1 year)
- Notification preferences -- users choose and manage notification channels (email, push, SMS) and can pause or resume alerts
Non-Functional
- Scalability -- track tens of millions of products with hundreds of millions of subscribers; handle batch price updates for the entire catalog
- Reliability -- ensure no missed notifications when a price drops below a user's threshold; tolerate crawler or data source failures
- Latency -- price updates reflected in the system within 15 minutes of actual change; notifications delivered within 5 minutes of detection
- Consistency -- eventual consistency acceptable for price history and display; at-least-once delivery for notifications
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Price Collection and Ingestion Pipeline
Collecting prices for millions of products reliably is the foundation. Interviewers want to see how you schedule data collection, handle rate limits, and ensure data freshness.
Hints to consider:
- Design a job scheduler that prioritizes products based on subscriber count and price volatility
- Implement per-domain rate limiting with adaptive backoff to avoid being throttled or banned
- Use Kafka as a durable buffer between price collectors and downstream processing, enabling replay on failure
- Deduplicate price observations using (product_id, timestamp, price) to avoid false change detections from crawler retries
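The deduplication hint above can be sketched as a small component. This is a minimal in-memory illustration; a production system would more likely back the seen-set with a TTL'd Redis key or a compacted Kafka topic keyed the same way (an assumption, not part of the problem statement):

```python
import hashlib


class PriceObservationDeduper:
    """Drops repeated (product_id, timestamp, price) observations so that
    crawler retries are not mistaken for new price events downstream.

    In-memory set for illustration only; swap for Redis/Kafka in production.
    """

    def __init__(self):
        self._seen = set()

    def _key(self, product_id, timestamp, price_cents):
        raw = f"{product_id}:{timestamp}:{price_cents}"
        return hashlib.sha256(raw.encode()).hexdigest()

    def is_new(self, product_id, timestamp, price_cents):
        """Return True the first time an observation is seen, False on retries."""
        key = self._key(product_id, timestamp, price_cents)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A duplicate observation (same product, timestamp, and price) is silently absorbed, so the change detector only ever sees each price event once.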
2. Change Detection and Notification Fanout
When a price drops, potentially millions of subscribers for popular products need to be notified. Interviewers test how you detect meaningful changes and fan out notifications without overwhelming downstream services.
Hints to consider:
- Compare new prices against the previous observation to detect changes, using a threshold to filter minor fluctuations
- For each price change, look up affected subscribers and their alert rules, evaluating each rule against the new price
- Fan out notification tasks to Kafka partitioned by user ID, consumed by notification workers that respect per-channel rate limits
- Implement idempotent notification delivery using (user_id, product_id, price_event_id) deduplication keys
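The rule-evaluation and idempotency hints can be made concrete with a short sketch. The rule shape (target price OR minimum percentage drop) follows the functional requirements; the field names and thresholds are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    user_id: str
    target_price_cents: Optional[int] = None  # fire when price <= target
    min_drop_pct: Optional[float] = None      # fire when drop >= this percent


def matching_users(rules, old_price_cents, new_price_cents):
    """Evaluate each subscriber's rule against a detected price change."""
    if new_price_cents >= old_price_cents:
        return []  # only price *drops* trigger alerts
    drop_pct = 100.0 * (old_price_cents - new_price_cents) / old_price_cents
    hits = []
    for r in rules:
        if r.target_price_cents is not None and new_price_cents <= r.target_price_cents:
            hits.append(r.user_id)
        elif r.min_drop_pct is not None and drop_pct >= r.min_drop_pct:
            hits.append(r.user_id)
    return hits


def notification_key(user_id, product_id, price_event_id):
    """Idempotency key; workers skip deliveries whose key is already recorded."""
    return f"{user_id}:{product_id}:{price_event_id}"
```

The matched user IDs would be produced to the user-partitioned Kafka topic, and each notification worker checks `notification_key` against a store (e.g. Redis SETNX) before sending, giving effectively-once delivery on top of at-least-once consumption.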
3. Time-Series Storage for Price History
Users expect to view price trends over months or years for tracked products. Interviewers probe your storage choices and query efficiency.
Hints to consider:
- Store price observations in a time-series-friendly format: partition by product_id with sort key on timestamp for efficient range queries
- Implement tiered storage: full-resolution data for recent observations (30 days), downsampled daily averages for older data
- Cache frequently accessed price histories (popular products, recent time windows) in Redis with TTL eviction
- Pre-compute common chart data (min, max, average over standard windows) to reduce query-time computation
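The tiered-storage hint implies a downsampling job that rolls full-resolution observations into daily aggregates once they age out of the hot window. A minimal sketch of that rollup, assuming observations arrive as `(unix_timestamp, price_cents)` pairs:

```python
from collections import defaultdict
from datetime import datetime, timezone


def downsample_daily(observations):
    """Collapse (timestamp, price_cents) observations into per-day
    min/max/avg rollups -- the shape the cold tier stores after the
    30-day full-resolution window expires."""
    buckets = defaultdict(list)
    for ts, price in observations:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        buckets[day].append(price)
    return {
        day: {"min": min(prices), "max": max(prices),
              "avg": sum(prices) / len(prices)}
        for day, prices in buckets.items()
    }
```

Because min/max/avg are exactly the values the history charts render, the downsampled rows double as the pre-computed chart data mentioned above, so long-window queries never touch raw observations.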
4. Crawl Scheduling and Prioritization
Not all products need the same update frequency. Interviewers want to see intelligent scheduling that balances freshness with resource costs.
Hints to consider:
- Assign products to priority tiers based on subscriber count, recent price volatility, and deal event schedules
- High-priority products (many subscribers, frequent changes) are crawled every 15 minutes; low-priority products every 6-24 hours
- Use a distributed job queue with rate-aware scheduling to spread load evenly across time
- Implement a feedback loop where detected price changes increase a product's crawl frequency temporarily
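The tiering and feedback-loop ideas above can be captured in a single scheduling function. The tier thresholds below are illustrative assumptions, not tuned values:

```python
def crawl_interval_minutes(subscriber_count, changes_last_7d, recently_changed=False):
    """Map a product's demand and volatility onto a crawl interval.

    Thresholds are placeholders; a real system would tune them against
    crawl capacity and freshness SLAs.
    """
    if subscriber_count >= 10_000 or changes_last_7d >= 5:
        interval = 15          # hot tier: every 15 minutes
    elif subscriber_count >= 100 or changes_last_7d >= 1:
        interval = 60          # warm tier: hourly
    else:
        interval = 6 * 60      # cold tier: every 6 hours
    if recently_changed:       # feedback loop: a fresh change halves the interval
        interval = max(15, interval // 2)
    return interval
```

The scheduler would recompute this interval whenever subscriber counts or change history update, and enqueue the next crawl job with jitter so load spreads evenly across time.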
Suggested Approach
Step 1: Clarify Requirements
Start by confirming scope. Ask about the product catalog size, expected subscriber counts per product, and notification delivery SLAs. Clarify whether the system crawls Amazon directly or consumes a price feed API. Determine what notification channels are required and whether batch/digest notifications are acceptable. Establish price history retention requirements and chart granularity expectations.