Design an online auction system where users can list items for sale with time-bound auctions, place incrementally higher bids, view the current leading bidder in real time, and receive notifications when auctions close. The winning bidder then proceeds to a checkout flow to complete the purchase.
Think of a platform similar to eBay or a hypothetical auction feature built into a large social app like Instagram. Creators post items with photos, set an auction window and starting price, and followers compete by submitting bids. The system must handle everything from casual listings with a handful of bidders to celebrity auctions attracting tens of thousands of simultaneous participants. Real interview reports from TikTok indicate interviewers may start at modest scale (around 5,000 requests per second) before pushing toward 50,000 or more during deep dives.
The fundamental challenge is maintaining strict correctness around bid acceptance and winner determination while delivering low-latency updates to all connected viewers. You need to reason about contention on hot auction records, push-based real-time fanout, durable end-of-auction scheduling, and reliable settlement workflows.
Based on real interview experiences at TikTok and related companies, these are the areas interviewers probe most deeply:
When thousands of bidders converge on a single auction record simultaneously, the system faces a classic write hotspot. Interviewers want to see how you coordinate concurrent writes without sacrificing throughput or correctness.
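One lock-free answer worth demonstrating is pushing the comparison into a single conditional UPDATE, letting the database arbitrate concurrent bids atomically: losers simply match zero rows. The sketch below uses SQLite and illustrative table and column names (`auctions`, `high_bid`), which are assumptions for the demo, not a schema from this design:

```python
import sqlite3

def place_bid(conn, auction_id, bidder_id, amount, min_increment=1):
    """Accept the bid only if it beats the current high bid by the
    minimum increment. The conditional UPDATE is atomic, so two
    concurrent bids cannot both win the same price level."""
    cur = conn.execute(
        "UPDATE auctions SET high_bid = ?, high_bidder = ? "
        "WHERE id = ? AND status = 'OPEN' AND high_bid + ? <= ?",
        (amount, bidder_id, auction_id, min_increment, amount),
    )
    conn.commit()
    return cur.rowcount == 1  # True => bid accepted

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auctions "
             "(id TEXT PRIMARY KEY, status TEXT, high_bid INTEGER, high_bidder TEXT)")
conn.execute("INSERT INTO auctions VALUES ('a1', 'OPEN', 100, 'alice')")
print(place_bid(conn, "a1", "bob", 105))    # True: 100 + 1 <= 105
print(place_bid(conn, "a1", "carol", 105))  # False: 105 + 1 > 105
```

The same shape works on PostgreSQL; the database's row-level write serialization replaces any application-level lock.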
Bidders and passive viewers both expect to see bid changes reflected instantly. Naively broadcasting every update to every client creates unsustainable load on high-profile auctions. Interviewers probe your fanout strategy.
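One fanout tactic worth naming is coalescing: viewers only need the latest high bid, so a burst of rapid updates can collapse into at most one broadcast per interval. A minimal in-process sketch (class name, interval, and payload shape are illustrative assumptions):

```python
import time

class CoalescingBroadcaster:
    """Collapse rapid bid updates into at most one broadcast per
    interval. Viewers only need the newest snapshot, not every
    intermediate bid, so superseded updates are dropped."""
    def __init__(self, send, interval=0.1):
        self.send = send              # callable that pushes to all sockets
        self.interval = interval
        self.pending = None
        self.last_sent = float("-inf")

    def on_bid(self, state):
        self.pending = state          # keep only the newest snapshot
        now = time.monotonic()
        if now - self.last_sent >= self.interval:
            self.flush(now)

    def flush(self, now=None):
        """Send whatever is pending; called by on_bid, a timer, or at close."""
        if self.pending is not None:
            self.send(self.pending)
            self.pending = None
            self.last_sent = time.monotonic() if now is None else now

sent = []
b = CoalescingBroadcaster(sent.append, interval=60)  # large interval for the demo
b.on_bid({"auction": "a1", "high_bid": 101})  # first update goes out immediately
b.on_bid({"auction": "a1", "high_bid": 102})  # coalesced away
b.on_bid({"auction": "a1", "high_bid": 103})  # coalesced, kept as pending
b.flush()                                     # timer fires the final snapshot
print(len(sent))  # 2 broadcasts instead of 3
```

On a celebrity auction this turns thousands of bid events per second into a bounded broadcast rate per WebSocket server.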
Auctions must close precisely at their scheduled end time, even if the system experiences restarts or partial outages. The winner determination and checkout initiation must happen exactly once. Interviewers look for durable timer design.
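A common durable-timer answer pairs a primary scheduler with a periodic sweeper that scans storage for overdue auctions, using the OPEN-to-CLOSED state transition as the exactly-once guard. A minimal in-memory sketch (field names are illustrative; production would run this as a query against the database so a restarted scheduler still finds auctions whose end time has passed):

```python
# In-memory stand-in for the auctions table.
auctions = {
    "a1": {"end_time": 100.0, "status": "OPEN"},
    "a2": {"end_time": 500.0, "status": "OPEN"},
    "a3": {"end_time": 50.0,  "status": "CLOSED"},  # already handled
}

def sweep_overdue(now):
    """Close every auction whose end time has passed. The OPEN ->
    CLOSED transition is the idempotency guard: a second sweep (or a
    concurrent scheduler instance) finds status CLOSED and does
    nothing, so winner notification and checkout fire once."""
    closed = []
    for auction_id, a in auctions.items():
        if a["status"] == "OPEN" and a["end_time"] <= now:
            a["status"] = "CLOSED"   # atomically, in a real database
            closed.append(auction_id)  # then: record winner, notify, checkout
    return closed

print(sweep_overdue(now=200.0))  # ['a1']
print(sweep_overdue(now=200.0))  # []  -- idempotent on re-run
```

Because the sweep is driven by persisted end times rather than in-memory timers, restarts and partial outages delay a close at worst; they never lose it.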
Interviewers at TikTok have been known to start with 5,000 RPS and then push to 50,000 during deep dives. You need to explain how the architecture adapts without a redesign.
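A quick back-of-envelope shows why partitioning by auction ID absorbs the jump: you add partitions and consumers rather than redesigning. The per-partition throughput below is an assumed figure for illustration, not a number from the source:

```python
# Back-of-envelope: how many Kafka partitions (and parallel bid
# processors) does the 50,000 RPS target need?
peak_rps = 50_000           # deep-dive target from the prompt
per_partition_rps = 5_000   # assumed sustainable per-partition rate
partitions = -(-peak_rps // per_partition_rps)  # ceiling division
print(partitions)  # 10
```

The same arithmetic applies to WebSocket servers (connections per node) and Redis Pub/Sub shards; the architecture scales each tier by adding units, not by changing shape.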
Start by confirming the scope of the auction system. Ask whether auctions are simple English auctions (ascending bids) or whether other formats (Dutch, sealed-bid) are needed. Establish scale: how many concurrent auctions, what is the peak number of bidders per auction, and what is the overall bid rate? Clarify whether items have reserve prices, whether the platform takes a commission, and whether there are anti-sniping rules. Confirm latency expectations for bid acceptance and real-time updates. Ask whether the interviewer wants you to include the checkout and payment flow or just the auction mechanics.
Sketch a system with five major components: an API gateway handling client connections, a bid ingestion service fronted by Kafka (partitioned by auction ID), a bid processor that validates bids against the authoritative auction record in PostgreSQL, a real-time notification layer using WebSocket servers subscribed to a Redis Pub/Sub channel per auction, and a scheduler service for auction close events. Store auction metadata and bid history in PostgreSQL, cache the current highest bid and auction state in Redis for fast reads, and use Kafka to decouple bid ingestion from processing and side effects. Object storage (S3 or equivalent) holds item images and media.
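The "partitioned by auction ID" choice can be made concrete with a stable partition-key hash: every bid for one auction lands on the same partition, so the bid processor sees that auction's bids in order. (Kafka's default partitioner uses murmur2; SHA-256 is used here only to keep the sketch self-contained.)

```python
import hashlib

def partition_for(auction_id: str, num_partitions: int) -> int:
    """Stable hash of the partition key. All bids for one auction map
    to one partition, giving per-auction ordering; different auctions
    spread across partitions, giving parallelism."""
    digest = hashlib.sha256(auction_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

p = partition_for("auction-42", 12)
assert all(partition_for("auction-42", 12) == p for _ in range(5))
print(p)  # same partition on every call
```

In practice you get this for free by setting the Kafka message key to the auction ID; the sketch just makes the ordering argument explicit.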
Walk through the critical path: a bid arrives at the API gateway, is published to the Kafka topic partitioned by auction ID, and is consumed by the bid processor. The processor reads the current auction aggregate from PostgreSQL (highest bid, version, status), validates the new bid (at least the current high bid plus the minimum increment, auction still open, user has a valid payment method), and performs an atomic compare-and-swap update on the auction row. If the CAS succeeds, the bid is accepted: publish an update event to Redis Pub/Sub for real-time fanout, write the bid to the history table, and return success. If the CAS fails (version mismatch), another bid was accepted first: return a rejection carrying the latest bid details. Discuss how this design avoids locks while guaranteeing correctness, and how partitioning by auction ID ensures ordered processing per auction.
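The validate-then-CAS step can be sketched in a few lines. This is a single-process illustration with made-up field names: the version number carries the compare-and-swap (the bid's expected version is the version read when the auction state was fetched), and a processed-bid set makes Kafka redelivery idempotent:

```python
auction = {"status": "OPEN", "high_bid": 100, "high_bidder": "alice",
           "version": 7, "min_increment": 5}
processed = set()  # bid IDs already applied (dedupe on Kafka redelivery)

def process_bid(bid):
    """Validate the bid against the current aggregate, then apply it
    only if the version still matches what the bidder saw."""
    if bid["bid_id"] in processed:
        return "duplicate"                       # redelivered message
    if auction["status"] != "OPEN":
        return "rejected: closed"
    if bid["amount"] < auction["high_bid"] + auction["min_increment"]:
        return "rejected: too low"
    if auction["version"] != bid["expected_version"]:
        return "rejected: lost race"             # CAS failure
    auction.update(high_bid=bid["amount"], high_bidder=bid["bidder"],
                   version=bid["expected_version"] + 1)  # CAS success
    processed.add(bid["bid_id"])
    return "accepted"

bid = {"bid_id": "b1", "bidder": "bob", "amount": 110, "expected_version": 7}
print(process_bid(bid))  # accepted
print(process_bid(bid))  # duplicate -- safe to redeliver
```

In the real system the version check and update are one SQL statement (or one Redis script), so the check-and-write cannot interleave with another processor.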
Cover the auction close workflow: the scheduler fires at the end time, the close handler checks auction state, records the winner, sends push notifications to the winner and losing bidders, and triggers the checkout flow. Discuss soft-close logic for anti-sniping. Address monitoring: track bid acceptance latency, rejection rates, WebSocket connection counts, and Kafka consumer lag. Cover failure modes: what happens if the bid processor crashes mid-transaction (Kafka reprocesses the message, CAS ensures idempotency), if WebSocket servers go down (clients reconnect and receive the latest state), or if the scheduler misses a close event (periodic sweeper picks up overdue auctions). Discuss security: rate limiting per user to prevent bid spam, authentication and authorization for bid placement, and fraud detection for collusion patterns.
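Of the security points, per-user rate limiting is the easiest to make concrete: a token bucket per user allows short bursts of legitimate bidding while throttling spam. Capacity and refill rate below are illustrative assumptions, not figures from the design:

```python
import time

class TokenBucket:
    """Per-user bid rate limiter. Each bid costs one token; bursts up
    to `capacity` are allowed, then bids are throttled to `rate`
    tokens per second. `clock` is injectable for testing."""
    def __init__(self, capacity=3, rate=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(capacity=3, rate=1.0)
print([bucket.allow() for _ in range(5)])  # burst of 3 allowed, then throttled
```

At scale the same counters live in Redis (keyed by user ID) so every API gateway enforces one shared limit.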