Design CamelCamelCamel
Problem Statement
Design a price tracking and alerting service for e-commerce products, similar to CamelCamelCamel. Users paste a product URL or use a browser extension to start monitoring an item, and the system periodically checks the price across retail platforms. Users can view historical price charts, set target-price alerts, and receive notifications when prices drop below their thresholds. Think of it as a specialized monitoring system that ingests external data under strict rate constraints, stores time-series pricing history, and fans out alerts to potentially millions of subscribers when popular products go on sale.
The primary engineering challenges are threefold. First, you must fetch prices from external APIs or web pages that impose rate limits and can change their formats without notice. Second, you need to store and query time-series data efficiently enough to render interactive charts with sub-second latency. Third, when a tracked product with thousands of watchers drops in price, you must evaluate all alert thresholds and deliver notifications quickly without overwhelming downstream services.
Interviewers ask this question to see how you handle third-party data ingestion at scale, design event-driven pipelines, choose appropriate storage for time-series workloads, and build reliable notification fan-out systems.
Key Requirements
Functional
- Product tracking -- users can add products by pasting a URL or using a browser extension, and immediately see the current price and basic statistics
- Price history visualization -- display historical price charts over selectable time ranges (7 days, 30 days, 1 year, all time) with smooth interactive rendering
- Alert configuration -- users set absolute target prices or percentage-drop thresholds and choose notification channels (email, push, SMS)
- Subscription management -- users can view all tracked products in a dashboard, edit alert thresholds, pause or resume tracking, and unsubscribe from individual products
Non-Functional
- Scalability -- support 50M tracked products, 100K price updates per minute, and 10M registered users with 1M daily active users
- Reliability -- 99.9% uptime for the alerting pipeline; zero missed alerts for legitimate price drops
- Latency -- price charts render in under 500ms; alerts delivered within 5 minutes of a confirmed price change
- Consistency -- eventual consistency acceptable for historical chart data; strong consistency required for alert threshold evaluation to avoid duplicate or missed notifications
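A quick back-of-envelope pass over the scalability numbers helps frame the later design choices. The sketch below uses only the two figures stated above; the 24-byte size of a stored price point is an illustrative assumption, not a requirement, and `estimate_load` is a hypothetical helper name.

```python
TRACKED_PRODUCTS = 50_000_000   # from the scalability requirement
UPDATES_PER_MINUTE = 100_000    # from the scalability requirement
BYTES_PER_POINT = 24            # ASSUMPTION: product id + timestamp + price, uncompressed


def estimate_load(products=TRACKED_PRODUCTS,
                  updates_per_minute=UPDATES_PER_MINUTE,
                  bytes_per_point=BYTES_PER_POINT):
    """Derive per-second, per-day, and per-product figures from the stated load."""
    updates_per_day = updates_per_minute * 60 * 24
    checks_per_product_per_day = updates_per_day / products
    return {
        "updates_per_second": updates_per_minute / 60,
        "updates_per_day": updates_per_day,
        "checks_per_product_per_day": checks_per_product_per_day,
        "avg_hours_between_checks": 24 / checks_per_product_per_day,
        # Only meaningful if every update is persisted as one raw point.
        "raw_gb_per_day": updates_per_day * bytes_per_point / 1e9,
    }


if __name__ == "__main__":
    for name, value in estimate_load().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the budget is 144M updates per day, or about 2.9 checks per product per day (one check roughly every 8 hours on average). That average is too slow for popular items, which is one reason a uniform polling interval does not fit the stated load.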
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Price Ingestion and Rate-Aware Scheduling
The system depends on external data sources that impose strict API rate limits or may block aggressive scrapers. A naive approach of polling every product at a fixed interval will either exhaust quotas or leave data stale for popular items.
Hints to consider:
- Design a priority-based scheduler where products near alert thresholds or with many watchers are checked more frequently than dormant ones
- Implement distributed rate limiting using token buckets partitioned by API provider, so workers across multiple nodes collectively respect quotas
- Use adaptive scheduling that increases check frequency when recent price volatility is detected for a product
- Partition products across worker pools by provider and region to isolate failures and respect per-endpoint limits independently
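The hints above can be combined into a small scheduler: a min-heap ordered by due time decides what to check next, and a per-provider token bucket decides whether the check may run now. This is a single-process sketch under stated assumptions: the class and function names, the watcher tiers, and the base and floor intervals are all hypothetical, and a distributed version would presumably keep the bucket state in a shared store rather than in process memory.

```python
import heapq


class TokenBucket:
    """Allows `rate_per_sec` requests per second with bursts up to `capacity`."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = now

    def try_acquire(self, now):
        # Refill based on elapsed time, capped at capacity, then spend one token.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def check_interval_seconds(watchers, near_threshold, volatile,
                           base=8 * 3600, floor=15 * 60):
    """ASSUMED heuristic: shorten the interval for popular, near-threshold, or volatile items."""
    interval = float(base)
    if watchers >= 1000:
        interval /= 4
    elif watchers >= 10:
        interval /= 2
    if near_threshold:
        interval /= 2
    if volatile:
        interval /= 2
    return max(float(floor), interval)


class Scheduler:
    def __init__(self, buckets):
        self.buckets = buckets   # provider name -> TokenBucket
        self.heap = []           # entries are (due_time, product_id, provider)

    def schedule(self, product_id, provider, due_time):
        heapq.heappush(self.heap, (due_time, product_id, provider))

    def next_batch(self, now):
        """Pop every due product whose provider still has quota; requeue the rest."""
        ready, deferred = [], []
        while self.heap and self.heap[0][0] <= now:
            due, product_id, provider = heapq.heappop(self.heap)
            if self.buckets[provider].try_acquire(now):
                ready.append(product_id)
            else:
                deferred.append((due, product_id, provider))
        for entry in deferred:
            heapq.heappush(self.heap, entry)
        return ready
```

After a worker fetches a price, it would call `schedule` again with `now + check_interval_seconds(...)`, so the priority signal feeds back into the queue. Products deferred for lack of quota keep their original due time and are retried on the next pass.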
2. Time-Series Storage and Chart Query Performance
Price history is append-heavy and read-heavy simultaneously. Users expect charts to load instantly even for products with years of daily data points. Poor data modeling leads to expensive full-range scans.
Hints to consider:
- Partition price data by product ID and time range to enable efficient range scans without touching unrelated products
- Pre-compute rollups at multiple granularities (raw hourly samples, daily summaries, weekly summaries) so long-range chart queries read fewer rows
- Use a time-series-optimized or columnar store for the raw price data (for example TimescaleDB); plain PostgreSQL with time-based partitioning is a workable alternative at smaller scale
- Cache rendered chart data in Redis with TTLs aligned to the update frequency -- a 1-hour TTL for products checked hourly avoids redundant queries
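To make the rollup idea concrete, here is a minimal in-memory sketch that collapses raw samples into one summary row per UTC day and picks a granularity for a requested chart range. The function names, the row fields, and the range-to-granularity mapping are assumptions for illustration; in a real deployment the aggregation would more likely run inside the database or a batch job.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_rollup(points):
    """Collapse raw (unix_ts, price_cents) samples into one summary row per UTC day."""
    by_day = defaultdict(list)
    for ts, price_cents in points:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        by_day[day].append((ts, price_cents))

    rows = []
    for day in sorted(by_day):
        samples = sorted(by_day[day])            # order by timestamp within the day
        prices = [price for _, price in samples]
        rows.append({
            "day": day.isoformat(),
            "min": min(prices),
            "max": max(prices),
            "close": prices[-1],                 # last observed price that day
            "samples": len(prices),
        })
    return rows


def pick_granularity(range_days):
    """ASSUMED mapping from chart range to the table that serves it."""
    if range_days <= 7:
        return "raw"
    if range_days <= 365:
        return "daily"
    return "weekly"
```

With this mapping a 1-year chart reads at most 365 daily rows instead of thousands of raw samples, which is what keeps long-range queries inside the latency target.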
3. Alert Evaluation and Notification Fan-Out
When a popular product drops in price, the system must evaluate thousands of user thresholds and dispatch notifications without scanning the entire subscriptions table. Duplicate alerts from retry storms or rapid price fluctuations erode user trust.
Hints to consider:
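- Maintain an inverted index in Redis mapping each product ID to its set of subscriber IDs, so a price change event finds the watchers with a single key lookup instead of a table scan (reading the members is still linear in the number of watchers); a sorted set scored by target price narrows the read further to only the thresholds the new price crosses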
- Decouple alert detection from notification delivery using a message queue -- the alert engine publishes notification tasks, and downstream workers handle email, push, and SMS independently
- Use idempotency keys (hash of user ID, product ID, price, and truncated timestamp) stored in Redis with a 24-hour TTL to prevent duplicate notifications
- Apply per-user rate limiting so rapid price fluctuations during flash sales do not spam users with multiple alerts per hour
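The index and idempotency hints can be sketched together. The example below uses plain Python structures as stand-ins for Redis: a per-product list sorted by target price plays the role of a sorted set, and a dictionary of expiry times plays the role of keys with a TTL. All class and function names are hypothetical, the one-hour timestamp truncation is an assumed choice, and "alert when the price is at or below the target" is an assumed reading of the threshold rule.

```python
import bisect
import hashlib


class ThresholdIndex:
    """product_id -> list of (target_cents, user_id), kept sorted by target."""

    def __init__(self):
        self._by_product = {}

    def add(self, product_id, user_id, target_cents):
        entries = self._by_product.setdefault(product_id, [])
        bisect.insort(entries, (target_cents, user_id))

    def triggered(self, product_id, new_price_cents):
        """Users whose target is at or above the new price (ASSUMED alert rule)."""
        entries = self._by_product.get(product_id, [])
        # "" sorts before any non-empty user id, so exact-match targets are included.
        start = bisect.bisect_left(entries, (new_price_cents, ""))
        return [user_id for _, user_id in entries[start:]]


def idempotency_key(user_id, product_id, price_cents, ts, window_seconds=3600):
    """Hash of user, product, price, and the timestamp truncated to a window."""
    bucket = int(ts) // window_seconds
    raw = f"{user_id}|{product_id}|{price_cents}|{bucket}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class Deduper:
    """In-memory stand-in for a key store with per-key expiry."""

    def __init__(self, ttl_seconds=24 * 3600):
        self.ttl = ttl_seconds
        self._expires_at = {}

    def first_time(self, key, now):
        expires = self._expires_at.get(key)
        if expires is not None and expires > now:
            return False                      # already sent within the TTL
        self._expires_at[key] = now + self.ttl
        return True


def notification_tasks(index, deduper, product_id, price_cents, now):
    """Build the tasks that the alert engine would publish to the queue."""
    tasks = []
    for user_id in index.triggered(product_id, price_cents):
        key = idempotency_key(user_id, product_id, price_cents, now)
        if deduper.first_time(key, now):
            tasks.append({"user_id": user_id, "product_id": product_id,
                          "price_cents": price_cents, "key": key})
    return tasks
```

Replaying the same price event inside the same window produces no new tasks, which is the property that protects users from retry storms. Against a real Redis instance the check-and-set in `first_time` would need to be a single atomic operation (for example `SET` with the `NX` and `EX` options) so that concurrent workers cannot both pass the check.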
4. Data Quality and False Alert Prevention
Scraped prices can be wrong due to currency mismatches, page format changes, temporary glitches, or promotional pricing that reverts within minutes. Sending alerts for false drops damages credibility.
Hints to consider:
- Normalize all prices to a canonical currency and unit before storage and comparison
- Flag suspicious changes (drops exceeding 80% or prices of zero) for verification through a secondary source or manual review queue
- Implement a confirmation window where a price must remain at the new level for a configurable period (e.g., 15 minutes) before triggering alerts
- Store raw API responses or page snapshots for audit trails and re-parsing when provider formats change
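Two of the checks above are simple enough to sketch directly: flagging implausible changes and holding a new price for a confirmation window before it is allowed to trigger alerts. The names below are hypothetical, prices are assumed to be integer cents in the canonical currency, and the 80% and 15-minute defaults simply mirror the examples in the hints.

```python
def is_suspicious(old_cents, new_cents, max_drop_ratio=0.80):
    """Flag zero/negative prices and drops larger than `max_drop_ratio`."""
    if new_cents <= 0:
        return True
    if old_cents > 0 and (old_cents - new_cents) / old_cents > max_drop_ratio:
        return True
    return False


class ConfirmationWindow:
    """Confirms a new price only after it has been observed unchanged for `hold_seconds`."""

    def __init__(self, hold_seconds=15 * 60):
        self.hold = hold_seconds
        self._pending = {}     # product_id -> (price_cents, first_seen_ts)
        self._confirmed = {}   # product_id -> last confirmed price_cents

    def observe(self, product_id, price_cents, now):
        """Return the price when it becomes confirmed, otherwise None."""
        if self._confirmed.get(product_id) == price_cents:
            # Price is back at the known level: discard any pending change.
            self._pending.pop(product_id, None)
            return None
        pending = self._pending.get(product_id)
        if pending is None or pending[0] != price_cents:
            # First sighting of this candidate price: start the clock.
            self._pending[product_id] = (price_cents, now)
            return None
        if now - pending[1] < self.hold:
            return None
        del self._pending[product_id]
        self._confirmed[product_id] = price_cents
        return price_cents
```

A promotional price that reverts before the window elapses never reaches the alert engine, because the revert clears the pending entry. The trade-off is added delay: the hold period counts against the 5-minute delivery target unless that target is measured from the moment of confirmation.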
Suggested Approach
Step 1: Clarify Requirements
Confirm scope with your interviewer: how many products and e-commerce platforms are covered, whether prices come from official APIs or require web scraping, acceptable staleness for price data (real-time vs. hourly), notification delivery guarantees (at-least-once vs. exactly-once), and whether the browser extension is part of this design. Ask about peak load scenarios, such as Black Friday, when millions of products change price at the same time.