Design an Inventory Management System — Blue Origin

Problem Statement

Design an inventory management system for a large retailer or e-commerce company operating multiple warehouses. The system must track real-time stock levels across all locations, process order fulfillment with accurate inventory deductions, support receiving and adjusting stock, trigger automated reordering when quantities drop below thresholds, and generate inventory reports. Warehouse staff, e-commerce platforms, and supply-chain systems all interact with inventory data concurrently.

The core engineering challenges are preventing overselling under high concurrency (the classic double-booking problem applied to stock units), orchestrating multi-step order fulfillment workflows that span reservation, payment, picking, and shipping, separating heavy analytical queries from the transactional write path, and providing real-time visibility into inventory movements. You will need to reason about contention management, saga-based workflows, event-driven architecture, and CQRS patterns.

Key Requirements

Functional

Product and SKU management -- maintain product catalog with attributes, pricing, and per-location availability across multiple warehouses
Stock tracking and adjustments -- track inventory levels by SKU and location; support receiving shipments, cycle counts, transfers, and manual corrections with audit trails
Order fulfillment -- reserve inventory for incoming orders, coordinate picking and packing, deduct stock on shipment, and handle cancellations and returns
Automated reordering -- trigger purchase orders when stock falls below configurable reorder points based on lead times and demand velocity
Reporting and analytics -- generate stock-on-hand, low-stock alerts, inventory valuation, and historical movement reports

Non-Functional

Scalability -- support 100-plus warehouses, one million-plus SKUs globally, and 20,000-plus inventory transactions per second during peak periods
Consistency -- strong consistency for inventory reservations to prevent overselling; eventual consistency acceptable for analytics and reporting
Availability -- 99.95 percent uptime; warehouse operations must continue even during partial network partitions
Latency -- inventory availability checks under 100 ms; order reservation under 200 ms; real-time stock updates propagated within one second

What Interviewers Focus On

Based on real interview experiences at Blue Origin, these are the areas interviewers probe most deeply:

1. Concurrency Control and Preventing Overselling

This is the defining challenge. Hundreds of concurrent order-fulfillment operations, receiving events, and cycle counts all modify the same inventory records. Interviewers want to see explicit strategies for preventing race conditions without killing throughput.

Hints to consider:

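Use conditional updates (UPDATE ... WHERE available_qty >= requested_qty) in PostgreSQL to check and decrement atomically in a single statement. This prevents oversells without an explicit SELECT ... FOR UPDATE, although the UPDATE still holds a row-level lock on the inventory row until its transaction ends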
For high-contention SKUs (flash sales), consider pre-allocating inventory into reservation pools or sharded counters to reduce row-level lock contention
Separate "available" from "reserved" and "committed" quantities so that in-flight reservations do not block new availability checks
Implement reservation TTLs: if an order is not confirmed within a timeout window, release the reserved quantity back to the available pool
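The conditional update and the sharded-counter hint can be combined. The following is a minimal sketch, not a production design: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, and the table name inventory_bucket, the bucket column, and both function names are assumptions made for illustration. Stock for one SKU is split across several rows so that concurrent reservations usually touch different rows.

```python
import random
import sqlite3


def make_db(sku_id: str, total_qty: int, buckets: int) -> sqlite3.Connection:
    """Create an in-memory table and spread total_qty across `buckets` rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE inventory_bucket (
               sku_id        TEXT    NOT NULL,
               bucket        INTEGER NOT NULL,
               available_qty INTEGER NOT NULL CHECK (available_qty >= 0),
               PRIMARY KEY (sku_id, bucket)
           )"""
    )
    base, extra = divmod(total_qty, buckets)
    with conn:  # commits on success, rolls back on exception
        for b in range(buckets):
            qty = base + (1 if b < extra else 0)
            conn.execute(
                "INSERT INTO inventory_bucket VALUES (?, ?, ?)", (sku_id, b, qty)
            )
    return conn


def try_reserve(conn: sqlite3.Connection, sku_id: str, qty: int, buckets: int) -> bool:
    """Try each bucket once, starting at a random one; True on the first success."""
    start = random.randrange(buckets)
    for offset in range(buckets):
        bucket = (start + offset) % buckets
        with conn:
            # Check and decrement in one statement: the WHERE clause is the guard.
            cur = conn.execute(
                """UPDATE inventory_bucket
                      SET available_qty = available_qty - ?
                    WHERE sku_id = ? AND bucket = ? AND available_qty >= ?""",
                (qty, sku_id, bucket, qty),
            )
        if cur.rowcount == 1:
            return True
    return False
```

One trade-off worth stating in an interview: a request larger than any single bucket fails here even when the total across buckets would cover it, so a real system would need a multi-bucket path or a periodic rebalance.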

2. Multi-Step Order Fulfillment Workflow

An order moves through reservation, payment authorization, picking, packing, shipping, and potentially returns. Each step can fail independently. Interviewers expect saga-based orchestration with compensation.

Hints to consider:

Model the fulfillment workflow as a saga with explicit states: RESERVED, PAYMENT_AUTHORIZED, PICKING, PACKED, SHIPPED, DELIVERED, RETURNED
Each step publishes an event to Kafka; downstream services react and advance the state or trigger compensation (for example, payment failure releases the reservation)
Attach idempotency keys to every state transition to make retries safe after crashes
Use a reservation cleanup job that periodically scans for reservations stuck in RESERVED beyond their TTL and releases them
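The state list and idempotency-key hints above can be sketched as a small in-memory state machine. This is illustrative only: the CANCELLED state, the transition table, and the compensation names (void_payment, release_reservation) are assumptions that the hints do not spell out, and a real service would persist the seen keys in the same transaction as the state change.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RESERVED = "RESERVED"
    PAYMENT_AUTHORIZED = "PAYMENT_AUTHORIZED"
    PICKING = "PICKING"
    PACKED = "PACKED"
    SHIPPED = "SHIPPED"
    DELIVERED = "DELIVERED"
    RETURNED = "RETURNED"
    CANCELLED = "CANCELLED"  # assumption: terminal state reached via compensation


# Allowed transitions; anything not listed is rejected.
ALLOWED = {
    State.RESERVED: {State.PAYMENT_AUTHORIZED, State.CANCELLED},
    State.PAYMENT_AUTHORIZED: {State.PICKING, State.CANCELLED},
    State.PICKING: {State.PACKED, State.CANCELLED},
    State.PACKED: {State.SHIPPED},
    State.SHIPPED: {State.DELIVERED},
    State.DELIVERED: {State.RETURNED},
    State.RETURNED: set(),
    State.CANCELLED: set(),
}


@dataclass
class FulfillmentSaga:
    order_id: str
    state: State = State.RESERVED
    seen_keys: set = field(default_factory=set)
    compensations: list = field(default_factory=list)

    def transition(self, target: State, idempotency_key: str) -> bool:
        """Apply a transition once; a replayed key is a no-op that returns False."""
        if idempotency_key in self.seen_keys:
            return False
        if target not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {target.name}")
        if target is State.CANCELLED:
            # Compensation: undo earlier steps, newest first.
            if self.state in (State.PAYMENT_AUTHORIZED, State.PICKING):
                self.compensations.append("void_payment")
            self.compensations.append("release_reservation")
        self.state = target
        self.seen_keys.add(idempotency_key)
        return True
```

A retried message that carries the same key leaves the saga unchanged, which is what makes redelivery after a crash safe.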

3. Event-Driven Architecture and Real-Time Visibility

Operators, dashboards, and downstream systems need immediate visibility into every inventory movement. Interviewers evaluate how you propagate changes without coupling producers to consumers.

Hints to consider:

Publish an inventory-movement event to Kafka for every stock change (received, reserved, picked, adjusted, transferred) with before and after quantities
Downstream consumers update real-time dashboards, trigger reorder checks, feed analytics pipelines, and synchronize search indexes
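Use an outbox pattern: write the event to an outbox table in the same database transaction as the inventory update, then have a separate process publish it to Kafka. An event is published only if the inventory change committed, and every committed change is eventually published; delivery is at least once, so consumers should deduplicate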
Accept eventual consistency for analytics while keeping the transactional path strongly consistent
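A minimal sketch of the outbox hint, using sqlite3 as a stand-in for PostgreSQL and a plain callable in place of a Kafka producer. The schema, the function names, and the event fields other than the before and after quantities are assumptions for illustration.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE inventory (sku_id TEXT PRIMARY KEY, on_hand INTEGER NOT NULL);
        CREATE TABLE outbox (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            payload   TEXT    NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    return conn


def receive_stock(conn: sqlite3.Connection, sku_id: str, qty: int) -> None:
    """Apply a stock change and record its event in one transaction."""
    with conn:  # both writes commit together or not at all
        cur = conn.execute(
            "UPDATE inventory SET on_hand = on_hand + ? WHERE sku_id = ?",
            (qty, sku_id),
        )
        if cur.rowcount != 1:
            raise KeyError(sku_id)  # raising inside the block rolls back
        after = conn.execute(
            "SELECT on_hand FROM inventory WHERE sku_id = ?", (sku_id,)
        ).fetchone()[0]
        event = {"type": "received", "sku_id": sku_id,
                 "before": after - qty, "after": after}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish unpublished rows in insertion order, then mark each as published."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        # A crash between publish and this UPDATE republishes the row on restart,
        # which is why delivery is at least once.
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Polling is the simplest relay; tailing the database log with a CDC tool is a common alternative that avoids the polling query.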

4. Separating OLTP from Analytics (CQRS)

Running heavy reports (inventory valuation, turnover analysis, demand forecasting) against the primary transactional database degrades order-processing performance. Interviewers look for explicit read/write separation.

Hints to consider:

Use Change Data Capture (CDC) from PostgreSQL to stream inventory changes to a columnar analytics database (ClickHouse or Snowflake)
Precompute common aggregates (stock on hand by location, weekly turnover rates) as materialized views updated incrementally from the event stream
Serve low-stock alerts from the event stream rather than polling the transactional database
Partition analytics tables by date for efficient historical queries and cost-effective retention
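To make the read side concrete, here is a small sketch of a projection that is fed only by inventory-movement events and never queries the transactional database. The class name, the event field names (warehouse_id, sku_id, before, after), and the single global threshold are assumptions; a real deployment would keep this state in the analytics store or a stream processor.

```python
from collections import defaultdict


class StockProjection:
    """Read-side view of stock on hand, built only from movement events."""

    def __init__(self, low_stock_threshold: int):
        self.threshold = low_stock_threshold
        self.on_hand = {}   # (warehouse_id, sku_id) -> latest quantity
        self.alerts = []    # keys that crossed below the threshold

    def apply(self, event: dict) -> None:
        key = (event["warehouse_id"], event["sku_id"])
        # Storing the absolute "after" value makes a replayed event harmless.
        self.on_hand[key] = event["after"]
        # Alert only on the downward crossing, not on every event below the line.
        if event["before"] >= self.threshold > event["after"]:
            self.alerts.append(key)

    def on_hand_by_warehouse(self) -> dict:
        totals = defaultdict(int)
        for (warehouse_id, _sku_id), qty in self.on_hand.items():
            totals[warehouse_id] += qty
        return dict(totals)
```

Because the alert fires on the crossing rather than on the level, a SKU that hovers below the threshold produces one alert instead of one per movement.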

5. Multi-Warehouse Allocation and Routing

When an order can be fulfilled from multiple warehouses, the system must select the optimal source based on inventory availability, proximity to the shipping destination, and warehouse capacity.

Hints to consider:

Implement a routing service that queries available inventory across warehouses and applies a scoring function (distance, stock depth, warehouse load) to select the best fulfillment location
Cache per-warehouse availability summaries in Redis for fast routing decisions, with invalidation driven by inventory events
Support split shipments when no single warehouse has all items, but prefer single-source fulfillment to reduce shipping cost
Handle the edge case where cached availability is stale: the reservation step will fail at the chosen warehouse, and the router retries with the next-best option
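The scoring and stale-cache fallback described above might look like the following sketch. The weights, the field names, and the try_reserve callback are all hypothetical; the callback stands in for the authoritative reservation call against the inventory service.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    warehouse_id: str
    distance_km: float
    cached_available: int  # possibly stale value read from the cache
    load: float            # 0.0 (idle) to 1.0 (saturated)


def score(c: Candidate, qty: int) -> float:
    """Lower is better. The weights are illustrative, not tuned."""
    # Penalize warehouses that would be left with thin stock after this order.
    depth_penalty = 0.0 if c.cached_available >= 2 * qty else 50.0
    return c.distance_km + 200.0 * c.load + depth_penalty


def route(
    candidates: Iterable[Candidate],
    qty: int,
    try_reserve: Callable[[str, int], bool],
) -> Optional[str]:
    """Attempt warehouses best-first; the reservation call is the source of truth."""
    viable = [c for c in candidates if c.cached_available >= qty]
    for c in sorted(viable, key=lambda c: score(c, qty)):
        if try_reserve(c.warehouse_id, qty):
            return c.warehouse_id
        # The cache was stale for this warehouse: fall through to the next best.
    return None
```

The cache only orders the attempts. Correctness still rests on the conditional update behind try_reserve, so a stale entry costs one extra round trip rather than an oversell.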

Suggested Approach

Step 1: Clarify Requirements

Confirm the number of warehouses, SKU count, and peak transaction volume. Ask whether the system serves a single e-commerce platform or multiple sales channels (marketplace, retail stores, wholesale). Clarify the fulfillment model: does the warehouse do picking and packing, or is it drop-ship? Determine reporting SLAs: are real-time dashboards required, or are hourly batch reports sufficient? Verify whether the system integrates with external ERP or warehouse-management hardware (barcode scanners, conveyor systems).

Step 2: High-Level Architecture

Sketch the main components:

Inventory service -- owns stock levels and handles reservations, adjustments, and transfers; backed by PostgreSQL sharded by warehouse ID
Order fulfillment service -- orchestrates the saga from reservation through shipping
Routing service -- selects the optimal warehouse for each order
Reorder service -- monitors stock levels and creates purchase orders when thresholds are breached
Event stream (Kafka) -- carries inventory-movement events for downstream consumers
Cache layer (Redis) -- caches available quantities for fast reads and routing decisions
Analytics database (ClickHouse) -- receives CDC from PostgreSQL for reporting and demand forecasting
API gateway -- routes requests from e-commerce platforms, warehouse devices, and admin dashboards

Step 3: Deep Dive on Inventory Reservation

Walk through the reservation path. An order arrives from the e-commerce platform. The routing service queries Redis for per-warehouse availability and selects the best warehouse. The inventory service starts a PostgreSQL transaction: it executes UPDATE inventory SET available_qty = available_qty - :requested, reserved_qty = reserved_qty + :requested WHERE sku_id = :sku AND warehouse_id = :wh AND available_qty >= :requested. If the update affects one row, the reservation succeeds; the service inserts a reservation record with an expiration timestamp, writes an inventory-movement event to the outbox table, and commits the transaction. A background process publishes the outbox event to Kafka. If the update affects zero rows (insufficient stock), the service returns an error and the router retries with the next warehouse. Discuss how this single atomic UPDATE prevents overselling without explicit locks, and how the reservation TTL ensures abandoned orders release stock.
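The reservation path walked through above can be sketched end to end. This is an assumption-laden illustration: sqlite3 stands in for PostgreSQL, the reservations and outbox schemas are invented for the example, the 900-second default TTL is arbitrary, and time is passed in explicitly so the expiry logic is easy to follow.

```python
import json
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE inventory (
    sku_id TEXT, warehouse_id TEXT,
    available_qty INTEGER NOT NULL,
    reserved_qty  INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (sku_id, warehouse_id)
);
CREATE TABLE reservations (
    reservation_id TEXT PRIMARY KEY, sku_id TEXT, warehouse_id TEXT,
    qty INTEGER NOT NULL, expires_at REAL NOT NULL, status TEXT NOT NULL
);
CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL);
"""


def reserve(conn, sku_id, warehouse_id, qty, now, ttl_seconds=900.0):
    """Return a reservation id, or None when the warehouse lacks stock."""
    with conn:  # one transaction: stock move, reservation row, outbox row
        cur = conn.execute(
            """UPDATE inventory
                  SET available_qty = available_qty - ?,
                      reserved_qty  = reserved_qty + ?
                WHERE sku_id = ? AND warehouse_id = ? AND available_qty >= ?""",
            (qty, qty, sku_id, warehouse_id, qty),
        )
        if cur.rowcount == 0:
            return None  # insufficient stock; the router tries the next warehouse
        reservation_id = str(uuid.uuid4())
        conn.execute(
            "INSERT INTO reservations VALUES (?, ?, ?, ?, ?, 'ACTIVE')",
            (reservation_id, sku_id, warehouse_id, qty, now + ttl_seconds),
        )
        event = {"type": "reserved", "reservation_id": reservation_id, "qty": qty}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))
        return reservation_id


def release_expired(conn, now):
    """Return stock from reservations whose TTL has passed; returns the count."""
    with conn:
        rows = conn.execute(
            """SELECT reservation_id, sku_id, warehouse_id, qty FROM reservations
                WHERE status = 'ACTIVE' AND expires_at <= ?""",
            (now,),
        ).fetchall()
        for reservation_id, sku_id, warehouse_id, qty in rows:
            conn.execute(
                "UPDATE reservations SET status = 'EXPIRED' WHERE reservation_id = ?",
                (reservation_id,),
            )
            conn.execute(
                """UPDATE inventory
                      SET available_qty = available_qty + ?,
                          reserved_qty  = reserved_qty - ?
                    WHERE sku_id = ? AND warehouse_id = ?""",
                (qty, qty, sku_id, warehouse_id),
            )
            event = {"type": "released", "reservation_id": reservation_id, "qty": qty}
            conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))
        return len(rows)
```

Note that the sum of available_qty and reserved_qty is unchanged by both functions, which gives a cheap invariant to check in monitoring.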

Step 4: Address Secondary Concerns

Cover the fulfillment saga: payment authorization, picking assignment, packing confirmation, shipping label generation, and carrier handoff, with compensation at each step. Discuss reordering: a Kafka consumer tracks inventory levels and triggers purchase orders when available quantity falls below the reorder point, which is derived from lead time and demand velocity. Because the reservation in Step 3 already moves reserved units out of available_qty, reserved quantity should not be subtracted a second time. Address analytics: CDC from PostgreSQL to ClickHouse for inventory valuation, turnover rates, and demand forecasting. Mention monitoring: track reservation latency, saga completion rate, stock discrepancy alerts, and Kafka consumer lag. Discuss scaling: shard PostgreSQL by warehouse ID, scale the inventory service horizontally, and use read replicas for non-critical queries.
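For the reordering discussion, one common textbook formulation sets the reorder point to expected demand over the supplier lead time plus a safety buffer. The sketch below assumes that formula; the safety_stock term and the on_order_qty check are additions for illustration that the text above does not specify, and available_qty is taken to be the Step 3 column that already excludes reservations.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReorderPolicy:
    daily_demand: float    # demand velocity, units per day
    lead_time_days: float  # supplier lead time
    safety_stock: int = 0  # assumption: buffer against demand spikes

    def reorder_point(self) -> int:
        """Units expected to sell during the lead time, plus the safety buffer."""
        return math.ceil(self.daily_demand * self.lead_time_days) + self.safety_stock


def should_reorder(available_qty: int, on_order_qty: int, policy: ReorderPolicy) -> bool:
    """True when stock plus open purchase orders falls below the reorder point.

    Counting on_order_qty avoids raising a second purchase order while an
    earlier one for the same SKU is still in transit.
    """
    return available_qty + on_order_qty < policy.reorder_point()
```

With 40 units sold per day, a 7-day lead time, and 50 units of safety stock, the reorder point is 330 units.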

Related Learning Resources

Design a Payment System -- covers saga-based orchestration, idempotency, and multi-step transactional workflows directly applicable to order fulfillment
Building block: Databases -- foundational concepts for transactional storage, sharding, and consistency models relevant to inventory management
