Design a Train Reservation System

Problem Statement

Design a seat reservation platform for a national high-speed rail network that serves millions of passengers daily across hundreds of routes. Travelers need to search train schedules between any two stations, view real-time seat availability and pricing, and complete secure bookings within seconds. The system must handle extreme demand spikes when new routes launch or holiday ticket sales open, while guaranteeing zero double-bookings even when thousands of users compete for the last remaining seats on a popular departure.

The core modeling challenge lies in segment-based inventory: a seat occupied from Station A to Station C is unavailable for the A-to-B leg, but becomes available again after Station C for the C-to-D portion of the journey. Your design must maintain strict transactional consistency during concurrent bookings, implement a reliable multi-phase reservation workflow (search, hold, payment, ticketing), and gracefully degrade under peak load without compromising data integrity. Bloomberg interviews feature this problem because it tests precise concurrency control, multi-step workflow orchestration, and capacity management under contention -- skills directly applicable to financial transaction systems.

Key Requirements

Functional

Schedule search -- users query train schedules between origin and destination stations for specific dates, viewing departure times, journey durations, intermediate stops, and connection options
Seat availability and pricing -- display available seats by class (economy, business, first class) with dynamic pricing based on demand levels, advance purchase timing, and route popularity
Reservation workflow -- hold selected seats temporarily during the checkout process, process payment securely through external gateways, issue confirmed tickets with unique booking references, and send confirmations via email and SMS
Booking management -- users can view upcoming trips, cancel reservations with refund rules applied automatically, modify travel dates subject to availability, and download printable or mobile tickets

Non-Functional

Scalability -- support 100,000 concurrent users during flash sales, sustain 500 seat bookings per second, and scale horizontally as the route network expands
Reliability -- achieve 99.9 percent uptime for booking services, implement automatic failover for payment processing, and ensure zero data loss for confirmed reservations
Latency -- return search results within 300ms, complete seat holds within 500ms, and provide sub-second booking confirmation after successful payment
Consistency -- guarantee strong consistency for seat allocation (absolutely no overbooking); eventual consistency is acceptable for search indexes and availability display caches

What Interviewers Focus On

Based on real interview experiences, these are the areas interviewers probe most deeply:

1. Segment-Based Inventory Management

The most common mistake candidates make is treating each train as having a single seat counter. A 400-seat train running from City A through B, C, and D actually has independent capacity for each segment (A-B, B-C, C-D). Seat 12A might be sold for the A-to-C portion, then resold for C-to-D, maximizing revenue and utilization.

Hints to consider:

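Model each train journey as a series of segments with independent availability counters; when a passenger books A-to-C, atomically decrement both the A-B and B-C segment counters within a single database transaction. Counters alone only bound how many seats can be sold; to assign a specific seat, also verify that the same seat is free on every segment of the trip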
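Use PostgreSQL exclusion constraints over a range column (for example, the span of stop indexes a seat is booked for), or explicit row locks on the affected segment rows, to prevent conflicting reservations on overlapping segments during concurrent transactions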
Pre-compute availability snapshots for common origin-destination pairs (e.g., top 50 routes) to speed up search queries while maintaining accuracy through cache invalidation on booking events
Discuss the trade-off between normalizing segments (flexible but complex queries) versus denormalizing seat-segment mappings (faster lookups but higher storage and update cost)
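
The segment model above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the `TrainInventory` class, its method names, and the seat labels are hypothetical, and a real system would keep this state in a transactional database so that the check and the write happen atomically.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TrainInventory:
    """Hypothetical in-memory stand-in for a per-departure seat/segment table."""

    stations: list[str]
    seats: list[str]
    # occupied[seat] holds the segment indexes already sold for that seat;
    # segment i covers stations[i] -> stations[i + 1].
    occupied: dict[str, set[int]] = field(default_factory=dict)

    def _segments(self, origin: str, destination: str) -> set[int]:
        start = self.stations.index(origin)
        end = self.stations.index(destination)
        if start >= end:
            raise ValueError("destination must come after origin")
        return set(range(start, end))

    def available_seats(self, origin: str, destination: str) -> list[str]:
        """Seats that are free on every segment of the requested trip."""
        wanted = self._segments(origin, destination)
        return [s for s in self.seats if not (self.occupied.get(s, set()) & wanted)]

    def book(self, seat: str, origin: str, destination: str) -> bool:
        """Reserve the seat on all segments, or on none if any overlap exists."""
        wanted = self._segments(origin, destination)
        taken = self.occupied.setdefault(seat, set())
        if taken & wanted:
            return False  # overlaps an existing reservation
        taken |= wanted
        return True
```

With stations A, B, C, D, booking seat 12A from A to C blocks a later B-to-C request for the same seat but still allows C-to-D, which is the resale behavior described above.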

2. Concurrency Control Under Contention

When a popular train has only 5 seats left and 50 users click "book" simultaneously, the system faces extreme write contention. Naive SELECT-then-UPDATE approaches create race conditions, while coarse-grained table locks destroy throughput.

Hints to consider:

Use PostgreSQL's SELECT ... FOR UPDATE SKIP LOCKED so concurrent transactions claim different available seats instead of queuing behind each other's row locks
Implement optimistic locking with version numbers on segment availability rows to detect conflicts and retry only the failed transactions
Shard inventory by coach or car number so multiple booking requests targeting different coaches can proceed in parallel without contention
Maintain a fast-path availability counter in Redis with atomic DECR operations for preliminary availability checks, with periodic reconciliation against the authoritative database state
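
As one illustration of the optimistic-locking hint, the sketch below uses a version column and a conditional UPDATE, then checks how many rows were changed. It runs against an in-memory SQLite database purely so the example is self-contained; the table name `segment_availability`, its columns, and the helper functions are assumptions for illustration, and the same compare-and-swap statement would be issued against the authoritative PostgreSQL store in the design discussed here.

```python
import sqlite3


def open_demo_db(remaining: int) -> sqlite3.Connection:
    """Create a throwaway database with one hypothetical segment row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE segment_availability ("
        " segment_id TEXT PRIMARY KEY,"
        " remaining INTEGER NOT NULL,"
        " version INTEGER NOT NULL)"
    )
    conn.execute(
        "INSERT INTO segment_availability VALUES ('A-B', ?, 0)", (remaining,)
    )
    conn.commit()
    return conn


def read_segment(conn: sqlite3.Connection, segment_id: str) -> tuple:
    """Return (remaining, version) for the segment."""
    return conn.execute(
        "SELECT remaining, version FROM segment_availability WHERE segment_id = ?",
        (segment_id,),
    ).fetchone()


def try_decrement(conn: sqlite3.Connection, segment_id: str, expected_version: int) -> bool:
    """Decrement only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE segment_availability"
        " SET remaining = remaining - 1, version = version + 1"
        " WHERE segment_id = ? AND version = ? AND remaining > 0",
        (segment_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows means a conflict or a sold-out segment


def book_with_retry(conn: sqlite3.Connection, segment_id: str, max_attempts: int = 3) -> bool:
    """Re-read and retry on conflict; give up once the segment is sold out."""
    for _ in range(max_attempts):
        remaining, version = read_segment(conn, segment_id)
        if remaining <= 0:
            return False
        if try_decrement(conn, segment_id, version):
            return True
    return False
```

Only the transaction that loses the race pays the retry cost; writers that do not collide never block each other, which is the appeal of this approach when contention is moderate. Under extreme contention on a single row, retries can dominate, which is where sharding or queueing tends to help.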

3. Multi-Phase Booking Workflow

Booking is not atomic from the user's perspective: they need time to review selections, enter passenger details, and wait for the payment gateway response. Meanwhile, held seats must be protected from other buyers but released if the session is abandoned or payment fails.

Hints to consider:

Implement a two-phase workflow: first create a temporary hold with a TTL (5 to 10 minutes), marking seats as HELD in the database; then convert to CONFIRMED status only after the payment gateway returns success
Use idempotency keys (client-generated UUIDs) to safely retry payment authorization without duplicate charges if network calls time out
Design compensating transactions (saga pattern) to roll back seat holds, refund payments, and clean up state when any step in the workflow fails
Persist workflow state changes durably (for example, in a transactional outbox table whose events are relayed to Kafka) so background workers can detect and resolve abandoned holds or stuck payment authorizations
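
The hold-then-confirm flow with a TTL and idempotency keys might look like the following sketch. Everything here is hypothetical and in-memory: `HoldService`, its fields, and the explicit `now` parameter (passed in so expiry is easy to reason about) are illustration choices, and the call to the payment gateway is assumed to have succeeded before `confirm` is invoked.

```python
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    HELD = "HELD"
    CONFIRMED = "CONFIRMED"
    EXPIRED = "EXPIRED"


@dataclass
class Hold:
    hold_id: str
    seat: str
    expires_at: float
    status: Status = Status.HELD


class HoldService:
    """Hypothetical hold manager; a real one would persist state in the database."""

    def __init__(self, ttl_seconds: float = 600.0) -> None:
        self.ttl = ttl_seconds
        self.holds = {}         # hold_id -> Hold
        self.seat_to_hold = {}  # seat -> hold_id while HELD or CONFIRMED
        self.processed = {}     # idempotency key -> Status already returned

    def expire_stale(self, now: float) -> None:
        """Release seats whose hold TTL has elapsed."""
        for hold in self.holds.values():
            if hold.status is Status.HELD and hold.expires_at <= now:
                hold.status = Status.EXPIRED
                del self.seat_to_hold[hold.seat]

    def create_hold(self, seat: str, now: float) -> Optional[Hold]:
        """Phase 1: place a temporary hold, or return None if the seat is taken."""
        self.expire_stale(now)
        if seat in self.seat_to_hold:
            return None
        hold = Hold(str(uuid.uuid4()), seat, now + self.ttl)
        self.holds[hold.hold_id] = hold
        self.seat_to_hold[seat] = hold.hold_id
        return hold

    def confirm(self, hold_id: str, idempotency_key: str, now: float) -> Status:
        """Phase 2: called after payment success; safe to retry with the same key."""
        if idempotency_key in self.processed:
            return self.processed[idempotency_key]
        self.expire_stale(now)
        hold = self.holds[hold_id]
        if hold.status is Status.HELD:
            hold.status = Status.CONFIRMED
        self.processed[idempotency_key] = hold.status
        return hold.status
```

If `confirm` returns `EXPIRED`, the payment went through for a seat that has already been released, so the caller would trigger the compensating step (a refund) from the saga described above.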

4. Peak Load Handling and Graceful Degradation

Flash sales for holiday travel or new route launches create traffic spikes 100 times normal load within minutes. The system must absorb these bursts without crashing while providing clear feedback to users.

Hints to consider:

Place rate limiters and admission control at the API gateway to shed excess load and prevent thundering herd from overwhelming the database
Implement virtual waiting rooms that queue excess users and drip-feed them to the booking service at a sustainable rate
Cache search results and availability snapshots aggressively (even with 10 to 30 second staleness), updating asynchronously as bookings complete
Use circuit breakers around the payment gateway and fall back to "pending confirmation" mode if the external service becomes temporarily unavailable
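
For the admission-control hint, a token bucket is one common way to shed excess load at the gateway. The sketch below is a single-process illustration under assumed names (`TokenBucket`, `allow`); a deployed gateway would typically use its built-in rate limiter or a shared store so that limits hold across instances. The clock is passed in explicitly to keep the behavior deterministic.

```python
class TokenBucket:
    """Hypothetical single-process token bucket for admission control."""

    def __init__(self, rate_per_sec: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity        # start full so an initial burst is admitted
        self.last_refill = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Admit the request if enough tokens remain; otherwise shed it."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = max(self.last_refill, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Requests that are rejected here are the natural candidates for the virtual waiting room: rather than returning an error, the gateway can enqueue them and admit them as tokens become available.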

Suggested Approach

Step 1: Clarify Requirements

Start by confirming scope and priorities. Ask about scale: how many trains operate daily, how many stations are in the network, and what is the average journey length in segments? Clarify whether the system handles only direct journeys or also multi-leg trips with connections. Verify consistency requirements -- can search results show slightly stale availability, or must every displayed seat be guaranteed bookable? Confirm whether dynamic pricing, waitlists, or group bookings are in scope. Understand peak load patterns: is it gradual daily growth or sudden flash-sale spikes?
