Design a Restaurant Booking Service
Problem Statement
Design a restaurant booking system that allows users to search for restaurants by name, cuisine, or location, check real-time table availability, make reservations, and manage their bookings. The system must handle peak dining hours when thousands of users simultaneously compete for limited tables at popular restaurants, ensure no double-bookings, and support temporary holds during the checkout process.
At Amazon, interviewers ask this to test your ability to manage contended inventory with time-sensitive holds, orchestrate multi-step booking workflows, and maintain data consistency across distributed components. Expect to discuss reservation atomicity, search performance, and graceful degradation under load.
Key Requirements
Functional
- Restaurant discovery -- users search for restaurants by name, cuisine, location, price range, and availability for a specific date, time, and party size
- Real-time availability -- users view open tables for their desired party size and time slot, with near-real-time updates as inventory changes
- Reservation with temporary holds -- selecting a time slot places a temporary hold on it (10-15 minutes) while the user completes the booking
- Booking management -- users can confirm, modify, or cancel reservations with appropriate notice periods and receive confirmations
Non-Functional
- Scalability -- support 50,000+ restaurants, 100K concurrent users, and 1M+ reservations per day during peak periods
- Reliability -- guarantee no double-bookings under any failure scenario; maintain 99.9% uptime with graceful degradation
- Latency -- availability searches under 300ms, booking confirmations under 500ms
- Consistency -- strong consistency for reservation writes; eventual consistency acceptable for search results and availability displays
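To put the scale target in perspective, the reservation volume can be turned into a rough write rate. The sketch below is only back-of-the-envelope arithmetic: the 1M reservations per day figure comes from the requirements above, while the 3-hour peak window and the 50% share of bookings falling inside it are illustrative assumptions, not numbers from the problem statement.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(per_day):
    """Average requests per second if load were spread evenly over the day."""
    return per_day / SECONDS_PER_DAY


def peak_rate(per_day, peak_window_hours, share_in_peak):
    """Average requests per second inside a peak window.

    share_in_peak is the fraction of the daily total that lands in the window.
    Both window length and share are assumptions to confirm with the interviewer.
    """
    return per_day * share_in_peak / (peak_window_hours * 3600)


if __name__ == "__main__":
    daily = 1_000_000  # from the scalability requirement
    print(f"average: {average_rate(daily):.1f} reservations/s")
    # Assumed: half of all bookings arrive within a 3-hour dinner rush.
    print(f"peak:    {peak_rate(daily, 3, 0.5):.1f} reservations/s")
```

Under these assumptions the aggregate write rate is modest (tens of writes per second). The difficulty lies in contention on a handful of hot restaurant/time-slot keys rather than in raw throughput.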
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Inventory Contention and Double-Booking Prevention
Multiple users simultaneously requesting the same time slot at a popular restaurant creates severe contention. This is the hardest problem and the core of most interview discussions.
Hints to consider:
- Use optimistic locking with version numbers: read the available count together with its version, compute the decremented count, and write it back only if the stored version still matches the one you read
- Model table inventory as time-slot tokens per restaurant with atomic decrement operations
- Implement temporary holds: when a user selects a slot, atomically decrement available count and create a hold record with TTL
- Handle conflicts gracefully: if the write fails (another user got the slot), retry with a different slot or return "unavailable"
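The optimistic-locking hint above can be sketched as a read, decrement, conditional-write loop. This is a minimal in-memory illustration, not a production design: SlotStore, write_if_version, and reserve are hypothetical names, and the threading lock stands in for the per-row atomicity that a real database's conditional update would provide.

```python
import threading
from dataclasses import dataclass


@dataclass
class SlotRecord:
    available: int  # tables still open for this restaurant/time slot
    version: int    # bumped on every successful write


class SlotStore:
    """In-memory stand-in for a database that supports conditional writes."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()  # models the database's per-row atomicity

    def put(self, key, available):
        with self._lock:
            self._rows[key] = SlotRecord(available, 0)

    def read(self, key):
        with self._lock:
            row = self._rows[key]
            return SlotRecord(row.available, row.version)  # return a copy

    def write_if_version(self, key, new_available, expected_version):
        """Compare-and-set: apply the write only if nobody else wrote first."""
        with self._lock:
            row = self._rows[key]
            if row.version != expected_version:
                return False
            row.available = new_available
            row.version += 1
            return True


def reserve(store, key, max_retries=3):
    """Return True if a table was claimed, False if sold out or too contended."""
    for _ in range(max_retries):
        snapshot = store.read(key)
        if snapshot.available <= 0:
            return False  # sold out: the caller should suggest another slot
        if store.write_if_version(key, snapshot.available - 1, snapshot.version):
            return True
        # The version changed under us: another user won this round, so re-read.
    return False
```

A caller that gets False back would typically offer nearby time slots rather than retrying the same one indefinitely. The retry cap keeps a heavily contended slot from turning into a busy loop.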
2. Temporary Hold Mechanism
Users need time to complete their reservation without losing their selection. But holds cannot persist indefinitely or they starve inventory.
Hints to consider:
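- Create hold records in Redis with TTLs so abandoned holds expire automatically. Key expiry only deletes the hold record, so either derive availability from the set of live holds or restore the available count when a hold expires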
- Use a background job as a safety net to clean up expired holds in case TTL mechanisms fail
- Allow a short grace period for the edge case where payment processing runs past the hold window
- Make hold creation idempotent using (user_id, restaurant_id, time_slot) as a deduplication key
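The hold behaviour described above (expiry, release back to the pool, idempotent creation) can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions: HoldManager is a hypothetical class, a Python dict plays the role of Redis, confirmed bookings are not modelled, and availability is derived from the live holds so that an expired hold frees its table without a separate increment step.

```python
import time


class HoldManager:
    """Toy hold store: capacity minus unexpired holds equals availability."""

    def __init__(self, capacity, ttl_seconds=600, clock=time.monotonic):
        self._capacity = capacity  # {(restaurant_id, time_slot): total tables}
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._holds = {}           # {(user_id, restaurant_id, time_slot): expires_at}

    def _sweep(self):
        """Drop expired holds; stands in for TTL expiry plus the cleanup job."""
        now = self._clock()
        for key in [k for k, expires_at in self._holds.items() if expires_at <= now]:
            del self._holds[key]

    def available(self, restaurant_id, time_slot):
        self._sweep()
        held = sum(
            1 for (_, r, s) in self._holds if (r, s) == (restaurant_id, time_slot)
        )
        return self._capacity[(restaurant_id, time_slot)] - held

    def create_hold(self, user_id, restaurant_id, time_slot):
        """Return the hold's expiry time, or None if no table is free."""
        self._sweep()
        key = (user_id, restaurant_id, time_slot)  # deduplication key
        if key in self._holds:
            return self._holds[key]  # idempotent: a retry returns the same hold
        if self.available(restaurant_id, time_slot) <= 0:
            return None
        self._holds[key] = self._clock() + self._ttl
        return self._holds[key]
```

Because a repeated request from the same user for the same slot returns the existing hold, client retries after a timeout do not consume a second table. In a distributed deployment the check-and-insert would need to be a single atomic operation on the store, which this single-process sketch does not show.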
3. Search and Availability Display
Showing accurate real-time availability across thousands of restaurants is expensive. Stale data frustrates users who encounter "unavailable" errors at checkout.
Hints to consider:
- Cache availability snapshots per restaurant-date combination in Redis with 2-minute TTLs
- Use change-data-capture from the reservation database to invalidate caches when bookings change
- Accept that search results may be slightly stale and handle conflicts at reservation time with retry suggestions
- Use CQRS: separate the read path (cached availability for browsing) from the write path (transactional reservations)
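The read-path hints above amount to a TTL cache that also accepts explicit invalidation. The following is a minimal sketch, assuming a hypothetical AvailabilityCache class and a caller-supplied load_from_db function; in a real system the cache would live in Redis and invalidate would be driven by change-data-capture events rather than called directly.

```python
import time


class AvailabilityCache:
    """Caches one availability snapshot per (restaurant_id, date)."""

    def __init__(self, load_from_db, ttl_seconds=120, clock=time.monotonic):
        self._load = load_from_db  # hypothetical loader: (restaurant_id, date) -> snapshot
        self._ttl = ttl_seconds    # 2-minute default, as suggested above
        self._clock = clock
        self._entries = {}         # {(restaurant_id, date): (expires_at, snapshot)}

    def get(self, restaurant_id, date):
        key = (restaurant_id, date)
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # fresh enough for browsing; may be slightly stale
        snapshot = self._load(restaurant_id, date)
        self._entries[key] = (self._clock() + self._ttl, snapshot)
        return snapshot

    def invalidate(self, restaurant_id, date):
        """Called when a booking change for this restaurant and date is observed."""
        self._entries.pop((restaurant_id, date), None)
```

The TTL bounds staleness if an invalidation event is lost, while invalidation shortens staleness in the common case. Either way, the write path still has to re-check availability transactionally at reservation time.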
4. Peak Load Management
Popular restaurants see reservation stampedes when new booking windows open. The system must handle surges without cascading failures.
Hints to consider:
- Implement rate limiting per user to prevent automated booking bots
- Use a virtual queue with position tracking when demand exceeds capacity for specific restaurants
- Deploy circuit breakers between the search and reservation services to prevent slow reservations from blocking searches
- Consider backpressure: if the reservation service is overwhelmed, return "try again in X seconds" rather than timing out
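Per-user rate limiting and the "try again in X seconds" response can be combined in a token bucket that reports how long the caller should wait. This is one possible sketch, not the only way to do it: TokenBucket and PerUserLimiter are hypothetical names, the rate and burst values are placeholders, and a shared deployment would keep the bucket state in a central store instead of process memory.

```python
import time


class TokenBucket:
    """Refills at rate_per_sec up to burst tokens; each request costs one token."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self._rate = rate_per_sec
        self._burst = burst
        self._clock = clock
        self._tokens = float(burst)
        self._updated = clock()

    def try_acquire(self):
        """Return 0.0 if the request is allowed, else the seconds to wait."""
        now = self._clock()
        elapsed = now - self._updated
        self._tokens = min(self._burst, self._tokens + elapsed * self._rate)
        self._updated = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return 0.0
        return (1.0 - self._tokens) / self._rate  # time until one token exists


class PerUserLimiter:
    """Keeps an independent bucket per user_id."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self._args = (rate_per_sec, burst, clock)
        self._buckets = {}

    def check(self, user_id):
        bucket = self._buckets.get(user_id)
        if bucket is None:
            bucket = self._buckets[user_id] = TokenBucket(*self._args)
        return bucket.try_acquire()
```

A non-zero return value maps naturally onto an HTTP 429 response with a Retry-After header, so clients back off for a known interval instead of waiting for a timeout.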
Suggested Approach
Step 1: Clarify Requirements
Confirm scope with your interviewer. Ask about expected scale (number of restaurants, concurrent users, daily reservations) and geographic distribution. Clarify whether the system handles walk-in waitlists, restaurant-specific policies (minimum party size, advance booking window), and payment integration. Determine if time zone handling across regions is in scope.