Design a Restaurant Booking Service — C3 AI

Problem Statement

Design a restaurant booking system that allows users to search for restaurants by name, cuisine, or location (defaulting to a 30-mile radius), view real-time table availability, and make, edit, or cancel reservations for individuals or groups. The platform serves both diners looking for a seamless reservation experience and restaurant operators managing limited table inventory during peak hours.

Consider the challenges of a Friday evening at 7 PM when hundreds of users simultaneously try to reserve tables at popular restaurants. Your system must prevent double-bookings, handle concurrent reservation attempts gracefully, and keep availability information fresh across web and mobile clients. Restaurant operators should be able to configure table layouts, time slots, and booking policies, while diners receive instant confirmation and timely reminders.

Key Requirements

Functional

Restaurant search -- users search by name, cuisine type, or geographic location, with results defaulting to a 30-mile radius around the user's position
Availability browsing -- users view available time slots for a given date, time, and party size, with near real-time accuracy reflecting ongoing bookings
Reservation creation -- users book a table for a specific date, time, and party size, receiving instant confirmation with a unique reservation code
Reservation modification -- users edit an existing reservation's time, date, or party size, with the system validating availability for the new parameters before releasing the old slot
Reservation cancellation -- users cancel a reservation and receive confirmation, with the freed slot immediately becoming available to other diners

Non-Functional

Scalability -- support millions of restaurant listings and hundreds of thousands of concurrent searches during peak dining hours
Reliability -- guarantee zero double-bookings through strong consistency on reservation writes, with 99.9% uptime for the booking path
Latency -- search results returned within 300ms at p95, reservation confirmation within 2 seconds end-to-end
Consistency -- eventual consistency acceptable for search index freshness, but strong consistency required for table inventory and reservation commits

What Interviewers Focus On

Based on real interview experiences, these are the areas interviewers probe most deeply:

1. Table Inventory Contention and Double-Booking Prevention

The most critical challenge is preventing two diners from reserving the same table at the same time, especially for popular restaurants during peak hours when dozens of requests target the same slots simultaneously.

Hints to consider:

Use optimistic concurrency control (a version check at write time) or pessimistic row locks (SELECT ... FOR UPDATE) on the specific table-timeslot combination to serialize competing writes
Implement short-lived holds with TTLs so that users browsing the checkout flow temporarily lock a slot without permanently consuming it
Consider modeling availability as discrete time-slot records per table group rather than computing availability on the fly from raw reservation data
Design compensating logic for the case where a hold expires mid-checkout or a payment step fails after the slot was reserved
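
As a rough illustration of the optimistic option, the sketch below decrements a per-slot availability counter only if the row's version is unchanged since it was read. It uses SQLite from the Python standard library purely so the example runs anywhere; the table name slot_inventory, its columns, and the function names are assumptions made for this example, not part of any prescribed schema.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one row per (restaurant, table group, slot start).
    conn.execute(
        "CREATE TABLE slot_inventory ("
        " restaurant_id INTEGER, table_group TEXT, slot_start TEXT,"
        " available INTEGER NOT NULL, version INTEGER NOT NULL,"
        " PRIMARY KEY (restaurant_id, table_group, slot_start))"
    )


def try_reserve(conn, restaurant_id, table_group, slot_start, max_retries=3):
    """Take one unit of capacity using a compare-and-swap on the version column."""
    key = (restaurant_id, table_group, slot_start)
    for _ in range(max_retries):
        row = conn.execute(
            "SELECT available, version FROM slot_inventory"
            " WHERE restaurant_id = ? AND table_group = ? AND slot_start = ?",
            key,
        ).fetchone()
        if row is None or row[0] <= 0:
            return False  # unknown slot or sold out
        _, version = row
        cur = conn.execute(
            "UPDATE slot_inventory"
            " SET available = available - 1, version = version + 1"
            " WHERE restaurant_id = ? AND table_group = ? AND slot_start = ?"
            " AND version = ?",
            key + (version,),
        )
        conn.commit()
        if cur.rowcount == 1:
            return True  # nobody changed the row between our read and write
        # Another writer won the race; re-read and try again.
    return False
```

With a single remaining table in a slot, the first call to try_reserve succeeds and the second returns False, which is the behavior that rules out a double-booking. Under heavy contention the retries become wasteful, which is where the pessimistic lock tends to be the better fit.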

2. Location-Based and Text Search at Scale

Users expect fast, relevant results when searching by cuisine, restaurant name, or proximity. Running these queries directly against the transactional database would create unacceptable latency and contention.

Hints to consider:

Offload search to a dedicated index like Elasticsearch with geo-distance queries for the 30-mile radius default and full-text analyzers for name and cuisine matching
Keep the search index eventually consistent with the primary data store by publishing change events through a message bus or CDC pipeline
Cache popular search results and trending restaurants per neighborhood to absorb repeated queries during peak hours
Re-validate availability against the source of truth at the moment of booking to handle stale search results gracefully
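
To make the search hint concrete, here is a minimal sketch of a request body that combines a full-text match with a geo-distance filter, using the Elasticsearch query DSL (bool, multi_match, geo_distance). The field names name, cuisine, and location, and the function name itself, are assumptions for this example; location would need to be mapped as a geo_point.

```python
def build_search_query(text, lat, lon, radius_miles=30):
    """Build a search body: optional text match plus a radius filter."""
    if text:
        # Hypothetical fields; "name^2" boosts name matches over cuisine matches.
        must = {"multi_match": {"query": text, "fields": ["name^2", "cuisine"]}}
    else:
        must = {"match_all": {}}
    return {
        "query": {
            "bool": {
                "must": must,
                # Filters do not affect scoring and can be cached by the engine.
                "filter": {
                    "geo_distance": {
                        "distance": f"{radius_miles}mi",
                        "location": {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }
```

Calling build_search_query("sushi", 37.77, -122.42) yields a body that can be sent to the index's search endpoint; the 30-mile default lives in one place, so a client-supplied radius simply overrides the argument.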

3. Reservation Edit as a Safe Swap

Modifying a reservation is more complex than a simple update because changing the time or party size requires securing a new slot before releasing the old one. A naive in-place update risks losing both slots if something fails mid-operation.

Hints to consider:

Treat an edit as an atomic "hold new slot, then release old slot" operation rather than updating the existing record in place
If the new slot cannot be secured, the original reservation should remain untouched and the user should receive clear feedback
Use a saga or two-step workflow that coordinates the new hold, confirmation, and old-slot release with rollback if any step fails
Consider idempotency tokens on edit requests so retries do not create duplicate reservations
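
The swap ordering and the idempotency token can be sketched in a few lines. The in-memory class below is only a stand-in for the real inventory store, and every name in it (SlotStore, edit, idempotency_key) is hypothetical; the point is the order of operations: secure the new slot first, release the old one only afterwards, and replay the stored outcome on retries.

```python
class SlotStore:
    """In-memory stand-in for slot inventory; not a production design."""

    def __init__(self, capacity, reservations):
        self.capacity = dict(capacity)          # slot -> remaining tables
        self.reservations = dict(reservations)  # reservation id -> slot
        self.completed_edits = {}               # idempotency key -> outcome

    def _hold(self, slot):
        if self.capacity.get(slot, 0) <= 0:
            return False
        self.capacity[slot] -= 1
        return True

    def _release(self, slot):
        self.capacity[slot] = self.capacity.get(slot, 0) + 1

    def edit(self, reservation_id, new_slot, idempotency_key):
        # A retry with the same key replays the earlier outcome, no side effects.
        if idempotency_key in self.completed_edits:
            return self.completed_edits[idempotency_key]
        old_slot = self.reservations[reservation_id]
        if self._hold(new_slot):
            # New slot is secured, so it is now safe to give up the old one.
            self.reservations[reservation_id] = new_slot
            self._release(old_slot)
            outcome = True
        else:
            # Nothing was changed; the original reservation stays intact.
            outcome = False
        self.completed_edits[idempotency_key] = outcome
        return outcome
```

If the target slot is full, edit returns False and the diner keeps the original booking. If the client retries a successful edit with the same key, the old slot is not released a second time.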

4. Real-Time Availability Updates

Diners and restaurant staff both need to see availability changes reflected quickly. A table that was just booked should disappear from other users' views within seconds, not minutes.

Hints to consider:

Push availability changes to connected clients via WebSockets or Server-Sent Events scoped to the restaurant being viewed
Maintain a fast-read availability cache in Redis that is updated synchronously on every reservation write and serves as the hot path for availability reads; the booking commit itself should still validate against the primary store
Publish reservation events to a message bus for fan-out to search index updates, analytics, and notification services
Design for graceful degradation: if the real-time channel is unavailable, fall back to polling with a short interval
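
The per-restaurant scoping of updates can be illustrated without any network code. The sketch below is a tiny in-process fan-out hub in which each callback stands in for one WebSocket or SSE connection; AvailabilityHub and its methods are hypothetical names, and a real deployment would sit behind a connection gateway and a shared message bus.

```python
from collections import defaultdict


class AvailabilityHub:
    """Routes availability events only to viewers of the affected restaurant."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # restaurant id -> callbacks

    def subscribe(self, restaurant_id, callback):
        self._subscribers[restaurant_id].append(callback)

        def unsubscribe():
            self._subscribers[restaurant_id].remove(callback)

        return unsubscribe

    def publish(self, restaurant_id, event):
        # Copy the list so a callback may unsubscribe while we iterate.
        delivered = 0
        for callback in list(self._subscribers.get(restaurant_id, [])):
            callback(event)
            delivered += 1
        return delivered
```

Scoping by restaurant keeps fan-out proportional to the number of people viewing that restaurant rather than to all connected clients.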

5. Restaurant Configuration and Booking Policies

Restaurants have varying table layouts, seating capacities, booking windows, and policies (minimum party size, cancellation deadlines, no-show penalties). Your data model must be flexible enough to support these without overcomplicating the booking logic.

Hints to consider:

Model table groups (e.g., "2-tops", "4-tops", "large party") rather than individual physical tables to simplify availability math and allow flexible assignment at the restaurant level
Store booking policies as configurable rules per restaurant: advance booking window, cancellation cutoff, maximum party size, and time slot duration
Allow restaurants to define blackout dates, special event pricing, and holiday hours as overrides on top of their regular schedule
Keep the booking engine policy-agnostic by evaluating rules at validation time rather than embedding them in the reservation flow
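
One way to keep the engine policy-agnostic is to treat policies as plain data and check them in a separate validation step. The sketch below assumes three illustrative rules (advance booking window, maximum party size, cancellation cutoff); the BookingPolicy fields and function names are inventions for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BookingPolicy:
    # Hypothetical per-restaurant settings, loaded from configuration.
    max_advance_days: int
    max_party_size: int
    cancellation_cutoff_hours: int


def validate_booking(policy, now, slot_start, party_size):
    """Return a list of human-readable violations; empty means allowed."""
    errors = []
    if slot_start <= now:
        errors.append("slot is in the past")
    if slot_start - now > timedelta(days=policy.max_advance_days):
        errors.append("slot is beyond the advance booking window")
    if party_size > policy.max_party_size:
        errors.append("party exceeds the maximum size")
    return errors


def can_cancel(policy, now, slot_start):
    """Cancellation is allowed until the cutoff before the slot starts."""
    return slot_start - now >= timedelta(hours=policy.cancellation_cutoff_hours)
```

The booking flow then only needs to ask whether the returned list is empty, so adding a new rule changes the validator and the policy record, not the reservation path.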

Suggested Approach

Step 1: Clarify Requirements

Begin by confirming scope with your interviewer. Ask about expected scale: how many restaurants, how many concurrent diners during peak hours, and the geographic coverage (single city vs. multi-region). Clarify whether the system handles payments or just reservations. Confirm whether group reservations have different rules than individual ones, whether waitlists are in scope, and how restaurant operators manage their table inventory. Establish latency targets for search versus booking confirmation, and determine the acceptable staleness for availability data shown in search results.

Step 2: High-Level Architecture

Sketch the core services and data flow. A Search Service backed by Elasticsearch handles geo-proximity and text queries across restaurant listings. An Availability Service manages table inventory with a strongly consistent Postgres store and a Redis cache for fast reads. A Booking Service orchestrates the reservation lifecycle -- holds, confirmations, edits, and cancellations -- using saga-style coordination across availability, notifications, and optional payment services. A Restaurant Management Service allows operators to configure table layouts, time slots, and policies. An API Gateway sits in front for authentication, rate limiting, and routing. A message queue like Kafka propagates reservation events to the search index, analytics, and notification systems asynchronously.

Step 3: Deep Dive on Booking Flow and Contention

Walk through the critical path when a diner confirms a reservation. The Booking Service receives the request with restaurant ID, date, time, and party size. It calls the Availability Service, which acquires a row-level lock on the matching table-group timeslot record in Postgres, checks that sufficient capacity remains, and decrements the available count while writing a hold record with a short TTL. The hold token is returned to the Booking Service, which proceeds with any confirmation steps (e.g., notification to the restaurant). On success, the hold is promoted to a confirmed reservation. On failure or timeout, the hold expires and the count must be restored, typically by a periodic cleanup job that sweeps expired holds. If hold state is mirrored in Redis, key-expiration notifications can trigger the release sooner, but they are delivered on a best-effort basis and should not be the only mechanism. Discuss sharding the availability table by restaurant ID to distribute write load, and using optimistic locking with version columns for restaurants with lower contention.
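
The hold lifecycle described here (place a hold, promote it on confirmation, restore capacity on expiry) can be sketched with an in-memory model. This is a simplification under stated assumptions: a single process, a dictionary in place of the database, and an injectable clock so expiry can be exercised without waiting. The class and method names are hypothetical.

```python
import time
import uuid


class AvailabilityService:
    """Toy model of holds with TTLs; a real service would persist this state."""

    def __init__(self, capacity, hold_ttl_seconds=300, clock=time.monotonic):
        self.available = dict(capacity)  # (table_group, slot) -> remaining count
        self.holds = {}                  # token -> (key, expires_at)
        self.confirmed = {}              # token -> key
        self.ttl = hold_ttl_seconds
        self.clock = clock

    def expire_holds(self):
        """Cleanup sweep: return capacity for every hold past its deadline."""
        now = self.clock()
        for token, (key, expires_at) in list(self.holds.items()):
            if expires_at <= now:
                del self.holds[token]
                self.available[key] += 1

    def place_hold(self, key):
        self.expire_holds()
        if self.available.get(key, 0) <= 0:
            return None  # no capacity left for this table group and slot
        self.available[key] -= 1
        token = uuid.uuid4().hex
        self.holds[token] = (key, self.clock() + self.ttl)
        return token

    def confirm(self, token):
        """Promote a live hold to a confirmed reservation."""
        self.expire_holds()
        hold = self.holds.pop(token, None)
        if hold is None:
            return False  # unknown or already expired; caller must start over
        self.confirmed[token] = hold[0]
        return True
```

Because confirm runs the sweep first, a diner who lingers past the TTL gets a clean failure rather than a reservation on capacity that may already have gone to someone else.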

Step 4: Address Secondary Concerns

Cover how search scales by partitioning Elasticsearch indexes geographically, caching popular queries at CDN and application layers, and syncing availability summaries from the booking path through CDC or event publishing. Discuss the edit flow as a hold-swap saga with clear rollback semantics. Address monitoring with metrics on booking funnel conversion, hold expiration rates, search latency percentiles, and availability cache hit ratios. Mention disaster recovery with Postgres replicas and Elasticsearch cluster redundancy. If time allows, discuss how waitlists, no-show tracking, and overbooking buffers extend the basic design.
