Design an e-commerce platform
Problem Statement
Design a high-traffic flash sale platform similar to Gilt or Rue La La, where limited quantities of products are released at specific times for brief selling windows. The system must handle massive spikes in concurrent users trying to purchase inventory that sells out in minutes or seconds, while ensuring no overselling occurs and providing a fair, functional user experience during extreme load.
The core challenge lies in keeping inventory consistent under extreme write contention while maintaining acceptable latency for browsing and checkout. During a typical flash sale event, you might see 100,000+ concurrent users attempting to purchase 500 units of a single product within the first 30 seconds. Your design must prevent overselling, degrade gracefully when resources are exhausted, and give users clear feedback about inventory status and whether their purchase succeeded or failed.
Key Requirements
Functional
- Event management -- Administrators can schedule flash sale events with start times, end times, and limited inventory allocations for specific products
- Pre-event registration -- Users can browse upcoming events and register for notifications before the sale begins
- Real-time inventory visibility -- Users see accurate, near-real-time stock availability during active sales, so that stale data does not show sold-out items as still available
- Purchase flow -- Users can claim inventory, complete checkout with payment processing, and receive immediate confirmation or clear rejection if inventory is exhausted
- Order management -- Users can view their purchase history, and the system supports order cancellation with inventory return within a short window
Non-Functional
- Scalability -- Support 500,000+ concurrent users during peak events with 10,000+ purchase attempts per second
- Reliability -- Zero overselling even under race conditions; maintain system availability with graceful degradation when capacity limits are reached
- Latency -- Product page loads under 200ms; inventory checks under 100ms; checkout completion under 2 seconds for successful purchases
- Consistency -- Strong consistency for inventory decrements and reservations; eventual consistency acceptable for ancillary data like view counts
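A quick back-of-envelope calculation makes the shape of the load concrete. The sketch below only combines figures already stated above (100,000 buyers, 500 units, 10,000 purchase attempts per second); the function name and the assumption that each buyer wants one unit are illustrative.

```python
def flash_sale_estimate(buyers, units, attempts_per_second):
    """Rough sizing for one product, assuming each buyer wants one unit."""
    return {
        # Fraction of buyers who can possibly succeed.
        "win_rate": units / buyers,
        # Buyers who must receive a fast, clear rejection.
        "rejected": buyers - units,
        # Time to sell out if every attempt at peak rate succeeded.
        "min_seconds_to_sell_out": units / attempts_per_second,
    }


estimate = flash_sale_estimate(
    buyers=100_000, units=500, attempts_per_second=10_000
)
```

Under these assumptions only 0.5% of buyers can succeed, and the stock could in principle be claimed within a twentieth of a second at peak attempt rate. Almost all of the traffic is therefore rejection traffic, which is why a cheap and fast "sold out" path matters as much as the purchase path.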
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Inventory Reservation and Consistency
Flash sales create an extreme write hotspot on a single inventory counter. Interviewers want to see how you prevent overselling while maintaining throughput, and whether you understand the tradeoffs between pessimistic locking, optimistic concurrency control, and reservation-based approaches.
Hints to consider:
- Consider separating inventory claim from payment processing with time-limited reservations that automatically expire
- Evaluate whether optimistic locking with version numbers can provide sufficient throughput or if you need sharding strategies
- Discuss how to handle reservation expiration and reclaiming inventory for fairness
- Think about queue-based approaches that serialize purchase attempts to prevent a thundering herd on the database
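One way to picture the reservation-based approach from these hints is the minimal in-memory sketch below. It assumes a single process and uses a lock as a stand-in for whatever atomic primitive the real inventory store provides; the class name `InventoryReservations`, its methods, and the injectable `clock` parameter are all hypothetical.

```python
import threading
import time
import uuid


class InventoryReservations:
    """In-memory stand-in for an atomic inventory store (illustrative only)."""

    def __init__(self, stock, hold_seconds, clock=time.monotonic):
        self._available = stock
        self._hold_seconds = hold_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._holds = {}             # reservation id -> expiry timestamp
        self._lock = threading.Lock()

    def _reclaim_expired(self):
        # Caller must hold the lock. Expired holds return to the pool.
        now = self._clock()
        expired = [rid for rid, exp in self._holds.items() if exp <= now]
        for rid in expired:
            del self._holds[rid]
            self._available += 1

    def reserve(self):
        """Claim one unit; return a reservation id, or None if sold out."""
        with self._lock:
            self._reclaim_expired()
            if self._available == 0:
                return None
            self._available -= 1
            rid = uuid.uuid4().hex
            self._holds[rid] = self._clock() + self._hold_seconds
            return rid

    def confirm(self, rid):
        """Turn a live hold into a sale; False if it expired or is unknown."""
        with self._lock:
            self._reclaim_expired()
            return self._holds.pop(rid, None) is not None

    def release(self, rid):
        """Give a live hold back early, for example on payment failure."""
        with self._lock:
            self._reclaim_expired()
            if self._holds.pop(rid, None) is None:
                return False
            self._available += 1
            return True

    def available(self):
        with self._lock:
            self._reclaim_expired()
            return self._available
```

The check and the decrement happen under the same lock, so two buyers can never both claim the last unit. In a distributed design the same invariant would have to come from the datastore itself (a conditional update, a single-threaded queue consumer, or similar), which is the tradeoff this area of the interview is meant to explore.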
2. Scaling Read Traffic and Pre-Event Load
Before a flash sale starts, large numbers of users repeatedly refresh the page while they wait. Interviewers look for strategies to offload this read-heavy traffic from your primary database while ensuring users see the sale open promptly.
Hints to consider:
- Use aggressive caching with short TTLs and cache warming for product detail pages
- Consider WebSocket connections or server-sent events to push sale start notifications rather than polling
- Discuss CDN strategies for static assets and product images to reduce origin load
- Plan for cache invalidation strategies when the sale transitions from "upcoming" to "active" state
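The interaction between short TTLs and explicit invalidation can be sketched as follows. This is a single-threaded illustration, not a production cache: `TTLCache`, the `loader` callback, and the `clock` parameter are hypothetical names, and it omits request coalescing for concurrent misses.

```python
import time


class TTLCache:
    """Read-through cache with a fixed TTL and explicit invalidation."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key, loader):
        """Return the cached value, calling loader(key) on miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]
        value = loader(key)  # this is the only path that reaches the origin
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Drop an entry so the next read reloads it immediately."""
        self._entries.pop(key, None)
```

With the TTL alone, a page cached as "upcoming" can stay stale for up to one TTL after the sale opens. Calling `invalidate` at the moment of the state transition removes that delay, at the cost of a burst of origin reads right when traffic peaks, so it is worth discussing how you would warm the "active" version ahead of time.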
3. Queue Management and Fair Access
When demand far exceeds supply, you need mechanisms to handle the overflow gracefully. Interviewers want to understand how you provide a fair user experience and prevent system collapse when 100,000 users compete for 500 items.
Hints to consider:
- Implement a virtual waiting room that admits users to the sale at a controlled rate
- Use randomized queue positions or lottery systems to provide fairness rather than pure speed advantages
- Consider rate limiting per user to prevent bot attacks and script-based purchases
- Discuss how to communicate queue position and estimated wait time to users
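A virtual waiting room along these lines might look like the sketch below. It assumes one possible fairness policy: everyone who arrives before the sale opens is shuffled into a random order, and later arrivals join the back of the line. The `WaitingRoom` class, its method names, and the notion of an admission "tick" are illustrative.

```python
import random
from collections import deque


class WaitingRoom:
    """Randomized pre-sale ordering with rate-controlled admission."""

    def __init__(self, admit_per_tick, rng=None):
        self._admit_per_tick = admit_per_tick
        self._rng = rng or random.Random()
        self._pre_sale = []   # arrivals before the sale opens
        self._queue = deque()
        self._seen = set()    # one place in line per user
        self._open = False

    def join(self, user_id):
        if user_id in self._seen:
            return  # repeated joins do not improve a user's position
        self._seen.add(user_id)
        if self._open:
            self._queue.append(user_id)
        else:
            self._pre_sale.append(user_id)

    def open_sale(self):
        # Shuffle early arrivals so refresh speed gives no advantage.
        self._rng.shuffle(self._pre_sale)
        self._queue.extend(self._pre_sale)
        self._pre_sale = []
        self._open = True

    def admit_next_batch(self):
        """Admit up to admit_per_tick users to the purchase flow."""
        batch = []
        while self._queue and len(batch) < self._admit_per_tick:
            batch.append(self._queue.popleft())
        return batch

    def position(self, user_id):
        """1-based queue position, or None if not currently queued."""
        for index, queued in enumerate(self._queue, start=1):
            if queued == user_id:
                return index
        return None

    def estimated_wait_ticks(self, user_id):
        """Admission ticks until this user is let in (ceiling division)."""
        pos = self.position(user_id)
        if pos is None:
            return None
        return -(-pos // self._admit_per_tick)
```

The linear scan in `position` is fine for a sketch but would not scale to hundreds of thousands of waiting users; a real design would more likely store a sequence number per user and compare it with the count already admitted.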
4. Payment Processing and Transaction Coordination
Successful inventory claims must be converted to paid orders, which requires coordination with external payment providers. Interviewers probe how you handle partial failures and retries, and how you ensure exactly-once payment semantics while holding inventory.
Hints to consider:
- Model checkout as a multi-step saga with compensation logic for each step
- Implement idempotency keys for payment authorization to handle retries safely
- Decide on hold duration for unpaid reservations and what happens on payment timeout
- Consider pre-authorization of payment methods before the sale to reduce checkout latency during the event
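The saga and idempotency-key hints can be combined in a small sketch. Everything here is hypothetical: `FakePaymentGateway` stands in for an external provider that deduplicates on an idempotency key, and `run_saga` and `checkout` are illustrative helpers rather than any real library's API. The order ID doubles as the idempotency key.

```python
class PaymentDeclined(Exception):
    pass


class FakePaymentGateway:
    """Stand-in provider that returns the same result for a repeated key."""

    def __init__(self, declined_keys=()):
        self._results = {}
        self._declined = set(declined_keys)
        self.charges_made = 0
        self.voided = set()

    def authorize(self, idempotency_key, amount_cents):
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # retry: no second charge
        if idempotency_key in self._declined:
            raise PaymentDeclined(idempotency_key)
        self.charges_made += 1
        result = {"auth_id": f"auth-{self.charges_made}", "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result

    def void(self, auth_id):
        self.voided.add(auth_id)


def run_saga(steps):
    """Run (action, compensation) pairs; on failure undo in reverse order."""
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        return False
    return True


def checkout(stock, gateway, orders, order_id, amount_cents):
    """Reserve, pay, record. Safe to call again with the same order_id."""
    if order_id in orders:
        return True  # already completed; a retry must not reserve again
    state = {}

    def reserve():
        if stock["available"] <= 0:
            raise RuntimeError("sold out")
        stock["available"] -= 1

    def unreserve():
        stock["available"] += 1

    def pay():
        state["auth"] = gateway.authorize(order_id, amount_cents)

    def void_payment():
        gateway.void(state["auth"]["auth_id"])

    def record():
        orders[order_id] = state["auth"]["auth_id"]

    def unrecord():
        orders.pop(order_id, None)

    return run_saga([(reserve, unreserve), (pay, void_payment), (record, unrecord)])
```

Note that idempotency keys do not make delivery exactly-once; they make at-least-once retries safe by ensuring a repeated request has no additional effect. A declined payment here triggers only the compensation for the reservation step, which returns the unit to stock for the next buyer.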