Practice/Meta/Design Yelp

Design Yelp

System DesignMust

Problem Statement

Design a food delivery platform similar to DoorDash or Uber Eats that connects customers with restaurants and delivery drivers. The system must handle the entire lifecycle: customers browse menus, place orders, restaurants receive and prepare them, and drivers pick up and deliver food in real-time. The platform should support millions of active users across hundreds of cities, with peak traffic during lunch and dinner hours reaching 100,000+ concurrent orders.

The core challenge lies in coordinating three distinct user types with conflicting needs: customers want fast delivery and accurate ETAs, restaurants need time to prepare quality food, and drivers want efficient routing to maximize earnings. Your design must balance real-time location tracking, dynamic pricing, reliable order state machines, and fair driver assignment while maintaining sub-second search performance and handling payment processing at scale.

Key Requirements

Functional

Restaurant Discovery -- customers search and filter restaurants by cuisine, distance, rating, delivery time, and price range within their delivery area
Order Placement -- customers add items to cart, apply promotions, choose delivery address, pay securely, and receive order confirmation with estimated delivery time
Order Fulfillment -- restaurants receive orders, confirm availability, update preparation status, and notify when ready for pickup
Driver Assignment -- system matches available drivers to ready orders based on proximity, capacity, and route optimization
Real-Time Tracking -- customers and restaurants see live driver location and updated ETAs as delivery progresses
Order History -- all parties access past orders with receipts, ratings, and issue resolution options

Non-Functional

Scalability -- support 100k concurrent orders during peak hours across 500+ cities with graceful degradation under load spikes
Reliability -- maintain 99.95% uptime for order placement and payment processing with no lost or duplicate orders even during failures
Latency -- restaurant search returns results in under 300ms at p95, real-time tracking updates every 5-10 seconds with minimal lag
Consistency -- order state transitions follow strict sequencing (placed → confirmed → preparing → ready → picked up → delivered) with exactly-once semantics for payments

What Interviewers Focus On

Based on real interview experiences, these are the areas interviewers probe most deeply:

1. Driver-Order Matching and Assignment

How you assign the right driver to each order is the most critical algorithmic challenge. Naive approaches like "nearest available driver" create inefficiency and poor earnings distribution. Interviewers want to see you reason about batching multiple orders, predicting driver availability windows, and balancing fairness with speed.

Hints to consider:

Use geohashing or quadtrees to find candidate drivers within a reasonable radius (e.g., 3 miles) rather than scanning all active drivers
Consider a scoring function that weighs distance, estimated pickup time, driver acceptance rate, and route compatibility for stacked orders
Implement a fallback broadcast mechanism where orders are offered to expanding rings of drivers if initial assignments are rejected
Account for driver preferences (e.g., avoiding highways) and vehicle type constraints (bike vs car for large orders)

2. Real-Time Location Tracking at Scale

Handling millions of GPS coordinate updates per minute from active drivers while serving low-latency queries to customers is non-trivial. Interviewers expect you to discuss write amplification, staleness tolerance, and efficient spatial indexing.

Hints to consider:

Batch location updates from mobile clients (every 5-10 seconds) and use UDP or WebSockets to reduce connection overhead
Store recent driver positions in Redis with geospatial indexes (GEOADD/GEORADIUS) for fast proximity queries with TTL-based cleanup
Use Kafka to stream location events for analytics and fraud detection without blocking the hot path
Implement client-side smoothing and dead reckoning to handle network interruptions gracefully and avoid jittery map pins

3. Order State Management and Consistency

Orders transition through multiple states across three different actors, and any inconsistency leads to bad experiences (double-charging, lost orders, stuck deliveries). Interviewers probe how you enforce correct sequencing and handle failures.

Hints to consider:

Model orders as finite state machines with explicit allowed transitions stored in PostgreSQL with optimistic locking or version numbers
Use Kafka or event sourcing to create an immutable audit trail of state changes with idempotency keys to prevent duplicate processing
Implement compensation logic (saga pattern) for rollbacks when payment succeeds but restaurant confirmation fails
Separate payment authorization from capture -- authorize on order placement, capture only when restaurant confirms to avoid holding funds for cancelled orders

4. Restaurant Search and Menu Data

Customers expect instant search results filtered by dozens of attributes, but menu data changes frequently (items go out of stock, prices update, hours change). Balancing freshness with query performance is key.

Hints to consider:

Use Elasticsearch with per-city sharding and geo_point queries to find restaurants within delivery radius with sub-300ms latency
Denormalize popular items and aggregate ratings into the search document to avoid joins, but keep full menus in PostgreSQL as source of truth
Implement cache-aside pattern with Redis for hot restaurant detail pages (TTL 5-15 minutes) with cache invalidation on menu updates via CDC
Handle dynamic availability by maintaining an in-memory bit vector of open/closed restaurants updated every minute rather than querying each result

Practice/Meta/Design Yelp

Design Yelp

System DesignMust

Problem Statement

Key Requirements

Functional

Restaurant Discovery -- customers search and filter restaurants by cuisine, distance, rating, delivery time, and price range within their delivery area
Order Placement -- customers add items to cart, apply promotions, choose delivery address, pay securely, and receive order confirmation with estimated delivery time
Order Fulfillment -- restaurants receive orders, confirm availability, update preparation status, and notify when ready for pickup
Driver Assignment -- system matches available drivers to ready orders based on proximity, capacity, and route optimization
Real-Time Tracking -- customers and restaurants see live driver location and updated ETAs as delivery progresses
Order History -- all parties access past orders with receipts, ratings, and issue resolution options

Non-Functional

Scalability -- support 100k concurrent orders during peak hours across 500+ cities with graceful degradation under load spikes
Reliability -- maintain 99.95% uptime for order placement and payment processing with no lost or duplicate orders even during failures
Latency -- restaurant search returns results in under 300ms at p95, real-time tracking updates every 5-10 seconds with minimal lag
Consistency -- order state transitions follow strict sequencing (placed → confirmed → preparing → ready → picked up → delivered) with exactly-once semantics for payments

What Interviewers Focus On

Based on real interview experiences, these are the areas interviewers probe most deeply:

1. Driver-Order Matching and Assignment

Hints to consider:

Use geohashing or quadtrees to find candidate drivers within a reasonable radius (e.g., 3 miles) rather than scanning all active drivers
Consider a scoring function that weighs distance, estimated pickup time, driver acceptance rate, and route compatibility for stacked orders
Implement a fallback broadcast mechanism where orders are offered to expanding rings of drivers if initial assignments are rejected
Account for driver preferences (e.g., avoiding highways) and vehicle type constraints (bike vs car for large orders)

2. Real-Time Location Tracking at Scale

Hints to consider:

Batch location updates from mobile clients (every 5-10 seconds) and use UDP or WebSockets to reduce connection overhead
Store recent driver positions in Redis with geospatial indexes (GEOADD/GEORADIUS) for fast proximity queries with TTL-based cleanup
Use Kafka to stream location events for analytics and fraud detection without blocking the hot path
Implement client-side smoothing and dead reckoning to handle network interruptions gracefully and avoid jittery map pins

3. Order State Management and Consistency

Hints to consider:

Model orders as finite state machines with explicit allowed transitions stored in PostgreSQL with optimistic locking or version numbers
Use Kafka or event sourcing to create an immutable audit trail of state changes with idempotency keys to prevent duplicate processing
Implement compensation logic (saga pattern) for rollbacks when payment succeeds but restaurant confirmation fails
Separate payment authorization from capture -- authorize on order placement, capture only when restaurant confirms to avoid holding funds for cancelled orders

4. Restaurant Search and Menu Data

Hints to consider:

Use Elasticsearch with per-city sharding and geo_point queries to find restaurants within delivery radius with sub-300ms latency
Denormalize popular items and aggregate ratings into the search document to avoid joins, but keep full menus in PostgreSQL as source of truth
Implement cache-aside pattern with Redis for hot restaurant detail pages (TTL 5-15 minutes) with cache invalidation on menu updates via CDC
Handle dynamic availability by maintaining an in-memory bit vector of open/closed restaurants updated every minute rather than querying each result