Design a price notification system
Problem Statement
Design a system that lets users track product prices on e-commerce platforms like Amazon, set target price thresholds, and receive notifications when prices drop below their desired levels. The system should also store historical pricing data so users can view price trends over time and make informed purchasing decisions.
The central engineering challenge is building a reliable data ingestion pipeline that continuously monitors prices for millions of tracked products across multiple retailers, detects meaningful price changes, and fans out notifications to potentially millions of subscribers per popular product. It must do this without overwhelming external data sources, spamming users with duplicate alerts, or losing price observations during component failures. You need to reason about crawl scheduling, rate limiting against external APIs, change detection with deduplication, time-series storage for history, and high-fanout asynchronous notification delivery.
Consider that popular products like gaming consoles or flagship phones can have hundreds of thousands of price watchers. A single price drop on a viral product could trigger a notification storm that must be delivered promptly without degrading the rest of the system.
Key Requirements
Functional
- Product tracking -- Users can add products by URL or identifier to their watchlist and see the current price fetched from the retailer
- Price alert rules -- Users can set custom alert conditions such as absolute target price, percentage drop from current price, or all-time low
- Multi-channel notifications -- Deliver alerts via email, push notification, or in-app message based on user preference, with digest options for frequent watchers
- Historical price charts -- Display price history over configurable time ranges (7 days, 30 days, 1 year) with daily or hourly granularity
- Alert management -- Users can pause, resume, edit, or delete their alerts, and view a log of past notifications received
Non-Functional
- Scalability -- Support 50 million tracked products, 200 million active alerts, and burst notification fanout of 1 million messages per minute for viral price drops
- Freshness -- Detect price changes within 15 minutes for high-priority products and within 1 hour for standard products
- Reliability -- Guarantee at-least-once notification delivery with idempotent deduplication to prevent duplicate alerts
- Latency -- Serve current price and history queries in under 200ms at p99; notification delivery within 5 minutes of price change detection
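A quick back-of-envelope check on these targets helps size the pipeline. The sketch below converts the freshness and fanout requirements into per-second rates. The 5% high-priority share is an assumption made purely for illustration; the requirements do not state how products split between the two freshness tiers.

```python
TRACKED_PRODUCTS = 50_000_000
HIGH_PRIORITY_SHARE = 0.05  # assumption for illustration; not given in the requirements

HIGH_PRIORITY_INTERVAL_S = 15 * 60  # freshness target: 15 minutes
STANDARD_INTERVAL_S = 60 * 60       # freshness target: 1 hour
BURST_MESSAGES_PER_MIN = 1_000_000  # burst fanout target


def required_fetch_rate(products: int, high_share: float) -> float:
    """Steady-state fetches per second needed to meet both freshness targets."""
    high = products * high_share
    standard = products - high
    return high / HIGH_PRIORITY_INTERVAL_S + standard / STANDARD_INTERVAL_S


fetches_per_sec = required_fetch_rate(TRACKED_PRODUCTS, HIGH_PRIORITY_SHARE)
messages_per_sec = BURST_MESSAGES_PER_MIN / 60

print(f"fetches/sec:  {fetches_per_sec:,.0f}")
print(f"messages/sec: {messages_per_sec:,.0f}")
```

Under that assumed split, the crawler needs roughly 16,000 fetches per second in steady state, and the notification path must absorb bursts of roughly 16,700 messages per second. Both numbers are large enough to rule out a single-node design.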
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Crawl Scheduling and External Rate Management
Monitoring prices for millions of products requires fetching data from external sources that impose rate limits, throttle aggressive clients, and may change their page structures without warning. Interviewers want to see how you balance freshness requirements against external constraints.
Hints to consider:
- Prioritize crawl frequency by product popularity and alert urgency -- products with many watchers or imminent price targets get checked more often
- Implement per-domain rate limiters using token bucket algorithms in Redis to stay within retailer API or scraping limits
- Use adaptive backoff that reduces crawl frequency when a source starts returning errors or CAPTCHAs
- Distribute crawl jobs across a pool of workers with different IP addresses and rotate user agents to reduce blocking risk
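The per-domain token bucket mentioned in the hints can be sketched as follows. This is a minimal single-process version for illustration only: a production system would keep the bucket state in a shared store such as Redis so that all crawl workers see the same budget. The class name, the `clock` parameter, and the example domain and rates are hypothetical.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """In-memory token bucket; `rate` tokens refill per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so tests can control time
        self.tokens = capacity      # start full to allow an initial burst
        self.last_refill = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per retailer domain (hypothetical limit: 5 requests/sec, burst of 10).
buckets: Dict[str, TokenBucket] = {
    "retailer.example": TokenBucket(rate=5.0, capacity=10.0),
}


def may_fetch(domain: str) -> bool:
    """Return True if a crawl worker may hit `domain` right now."""
    bucket = buckets.get(domain)
    return bucket is not None and bucket.try_acquire()
```

When `may_fetch` returns False, the worker would typically requeue the crawl job with a delay rather than block. Adaptive backoff could be layered on by lowering `rate` when a source starts returning errors or CAPTCHAs.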
2. Change Detection and Alert Evaluation
Each price observation must be compared against both the previous price (to detect changes) and all active alerts for that product (to trigger notifications). Doing this naively for products with hundreds of thousands of watchers creates hot keys and excessive database reads.
Hints to consider:
- Store the latest known price per product in Redis for O(1) comparison during ingestion -- only proceed to alert evaluation if the price actually changed
- Index active alerts by product ID in a partitioned data store so evaluating alerts for a single product is a bounded operation
- Use conditional evaluation to short-circuit: if the new price is higher than before, skip all "price drop" alerts without loading them
- Batch alert evaluations and publish matched alerts to a Kafka topic for asynchronous notification delivery
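The ingestion-time flow in these hints can be sketched as below. Plain dictionaries stand in for the Redis latest-price cache and the alert store partitioned by product ID, and the `Alert` type and `evaluate_observation` function are hypothetical names. The sketch also assumes, beyond what the hints say, that an alert should fire only when a drop crosses its threshold, so alerts that were already satisfied at the previous price are not re-sent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Alert:
    alert_id: str
    user_id: str
    target_price: float  # fire when the price falls to or below this value


def evaluate_observation(
    latest_prices: Dict[str, float],
    alerts_by_product: Dict[str, List[Alert]],
    product_id: str,
    new_price: float,
) -> List[Alert]:
    """Record a price observation and return the alerts it newly triggers."""
    previous = latest_prices.get(product_id)
    latest_prices[product_id] = new_price

    # Short-circuit: an unchanged or higher price cannot trigger a drop alert,
    # so the alert list is never loaded.
    if previous is not None and new_price >= previous:
        return []

    alerts = alerts_by_product.get(product_id, [])
    return [
        a for a in alerts
        if new_price <= a.target_price
        and (previous is None or previous > a.target_price)  # threshold crossed now
    ]


latest = {"p1": 500.0}
alerts = {"p1": [Alert("a1", "u1", 450.0), Alert("a2", "u2", 400.0)]}
matched = evaluate_observation(latest, alerts, "p1", 440.0)
print([a.alert_id for a in matched])
```

In a full pipeline, the matched alerts would be published to a queue such as Kafka for asynchronous delivery rather than returned to the caller.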
3. High-Fanout Notification Delivery
A single price drop on a popular product can match millions of alerts simultaneously. Delivering all those notifications quickly without overwhelming email providers or push notification services requires careful throttling and batching.
Hints to consider:
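- Partition notification work by delivery channel and batch messages up to each provider's per-request limit (for example, SendGrid's v3 Mail Send accepts up to 1,000 personalizations per request, while Amazon SES bulk sends accept up to 50 destinations per call)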
- Use separate Kafka consumer groups for each notification channel so email delivery lag does not block push notifications
- Implement per-user rate limiting and deduplication windows to prevent spamming users who track many products from the same retailer
- Add circuit breakers around external notification providers to handle temporary outages without losing queued messages
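Two of the ideas above, idempotent deduplication and provider batching, can be sketched together. The key format, the `Deduplicator` class, and the `batched` helper are hypothetical. The in-memory set stands in for a shared store with expiry (for instance, a Redis `SET` with the `NX` and `EX` options), which is what would bound the deduplication window in practice.

```python
import hashlib
from itertools import islice
from typing import Iterable, Iterator, List, Set, TypeVar

T = TypeVar("T")


def dedup_key(alert_id: str, product_id: str, price_cents: int) -> str:
    """Stable idempotency key: same alert + same observed price -> same key."""
    raw = f"{alert_id}:{product_id}:{price_cents}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class Deduplicator:
    """In-memory stand-in for a shared 'seen keys' store with a TTL."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def first_time(self, key: str) -> bool:
        """Return True only the first time a key is presented."""
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive chunks of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


dedup = Deduplicator()
events = [("a1", "p1", 44000), ("a1", "p1", 44000), ("a2", "p1", 44000)]
to_send = [e for e in events if dedup.first_time(dedup_key(*e))]
print(len(to_send))
```

Because delivery is at-least-once, a consumer may see the same matched alert more than once after a retry or rebalance. Checking the key before sending is what keeps the redelivery from becoming a duplicate notification.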
4. Time-Series Storage for Price History
Users expect to view price trends spanning months or years, but storing every price observation at full granularity for 50 million products would consume enormous storage. Interviewers look for a tiered retention strategy.
Hints to consider:
- Store recent observations (last 7 days) at hourly granularity in a hot store like DynamoDB or TimescaleDB
- Downsample older data to daily aggregates (min, max, average price) and store in a warm tier
- Archive data older than one year to object storage (S3) with on-demand retrieval for deep historical queries
- Pre-compute common chart data (30-day and 90-day views) and cache them to avoid repeated aggregation queries
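The downsampling step for the warm tier can be sketched as a simple group-by-day rollup. The `DailyAggregate` type and `downsample_daily` function are hypothetical names, and the sketch assumes timezone-aware timestamps bucketed by UTC calendar day. The average here is an unweighted mean of observations; a time-weighted average would be a reasonable alternative if crawl intervals are uneven.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DailyAggregate:
    day: date
    min_price: float
    max_price: float
    avg_price: float


def downsample_daily(
    observations: List[Tuple[datetime, float]],
) -> List[DailyAggregate]:
    """Collapse (timestamp, price) observations into one aggregate per UTC day."""
    by_day: Dict[date, List[float]] = defaultdict(list)
    for ts, price in observations:
        by_day[ts.astimezone(timezone.utc).date()].append(price)
    return [
        DailyAggregate(day, min(prices), max(prices), sum(prices) / len(prices))
        for day, prices in sorted(by_day.items())
    ]


utc = timezone.utc
history = [
    (datetime(2024, 1, 1, 0, 0, tzinfo=utc), 100.0),
    (datetime(2024, 1, 1, 12, 0, tzinfo=utc), 90.0),
    (datetime(2024, 1, 2, 6, 0, tzinfo=utc), 95.0),
]
for agg in downsample_daily(history):
    print(agg)
```

A scheduled job could run this over observations leaving the 7-day hot window and write the aggregates to the warm tier before the raw rows expire.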