Design a Book Seller Platform
Problem Statement
Design a dynamic pricing platform for short-term home rentals (similar to Airbnb's pricing engine) that helps property owners automatically set competitive nightly rates based on real-time market conditions. The system should analyze comparable listings in the area, seasonal demand patterns, local events, and property characteristics to suggest optimal pricing. Property owners can configure pricing rules, set floor and ceiling prices, and receive daily recommendations. The platform must handle 50,000 properties with pricing updates every hour, process pricing queries in under 500ms, and integrate with external data sources for events, weather, and competitor pricing.
As rental markets become increasingly competitive, hosts need automated tools to maximize occupancy while staying price-competitive. This system must balance multiple data sources, some updated in real time and others batch-refreshed daily, and produce pricing recommendations that reflect both immediate market conditions and longer-term trends. Interviewers use this question to assess your ability to design a data-intensive pipeline with heterogeneous refresh rates, build a responsive recommendation API under strict latency constraints, handle schema evolution from third-party feeds, and implement caching strategies that balance freshness with performance at scale.
Key Requirements
Functional
- Property Configuration -- Owners register properties with attributes (location, bedrooms, amenities) and set pricing constraints including minimum and maximum nightly rates
- Pricing Recommendations -- System generates daily price suggestions based on comparable listings, demand forecasts, upcoming events, and historical booking patterns
- Manual Override -- Owners can accept, reject, or modify system recommendations and set custom prices for specific date ranges
- Performance Dashboard -- Owners view historical pricing decisions, occupancy rates, revenue metrics, and comparison against market benchmarks
Non-Functional
- Scalability -- Support 50,000 active properties with hourly recommendation updates and handle 1,000 pricing queries per second during peak hours
- Reliability -- Continue providing pricing recommendations even when external data sources are unavailable using cached fallback data
- Latency -- Serve pricing recommendations within 500ms at p99 and property search results within 200ms
- Consistency -- Ensure prices displayed to guests match owner-set prices; for recommendation updates, eventual consistency within 5 minutes is acceptable
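The scale figures above can be turned into a quick back-of-envelope estimate. The sketch below uses only the numbers stated in the requirements; the variable names are illustrative, and treating the 500ms p99 budget as the per-request latency gives a pessimistic bound on in-flight requests rather than an expected value.

```python
# Back-of-envelope sizing from the stated requirements.
PROPERTIES = 50_000          # active properties
REFRESH_INTERVAL_S = 3_600   # recommendations refreshed hourly
PEAK_QPS = 1_000             # pricing queries per second at peak
P99_BUDGET_MS = 500          # latency budget for a pricing query

# Spreading the hourly refresh evenly gives the recompute (write) rate.
recompute_per_s = PROPERTIES / REFRESH_INTERVAL_S

# Reads dominate writes, which is what makes precomputation and caching pay off.
read_write_ratio = PEAK_QPS / recompute_per_s

# Little's law: in-flight requests = arrival rate x time in system.
# Using the p99 budget instead of the mean latency makes this an upper bound.
max_in_flight = PEAK_QPS * (P99_BUDGET_MS / 1_000)

print(f"recomputes/s: {recompute_per_s:.1f}")
print(f"read:write ratio: {read_write_ratio:.0f}:1")
print(f"in-flight requests (upper bound): {max_in_flight:.0f}")
```

Roughly 14 recomputations per second against 1,000 reads per second suggests the write path is small and the design effort belongs on the read path.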
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Data Pipeline Architecture for Multi-Source Ingestion
The system must combine data from sources refreshed at different intervals: competitor pricing scraped hourly, event calendars updated daily, weather forecasts refreshed every 6 hours, and booking transactions streaming in real time. Interviewers want to see how you handle schema drift, late-arriving data, and partial source failures.
Hints to consider:
- Design separate ingestion pipelines per source type with independent retry and backoff logic to prevent cascading failures
- Use a unified data lake or warehouse layer where raw feeds land before transformation to decouple ingestion from consumption
- Implement schema versioning and validation at ingestion boundaries to catch breaking changes early and maintain backward compatibility
- Consider lambda architecture with both batch reprocessing for historical corrections and stream processing for real-time signals
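As one way to picture the schema-versioning hint above, here is a minimal sketch of validation at the ingestion boundary. Everything named here (the `SCHEMAS` registry, the `competitor_prices` source, the field names, and the `schema_version` key) is hypothetical and chosen for illustration; a production pipeline would more likely rely on a schema registry or a validation library.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical registry: (source, schema version) -> required fields and types.
SCHEMAS: dict[tuple[str, int], dict[str, type]] = {
    ("competitor_prices", 1): {"listing_id": str, "nightly_rate": float},
    ("competitor_prices", 2): {"listing_id": str, "nightly_rate": float, "currency": str},
}

@dataclass
class IngestResult:
    accepted: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)  # (record, reason) pairs

def validate(source: str, record: dict[str, Any]) -> Optional[str]:
    """Return None if the record matches a known schema, else a reason string."""
    version = record.get("schema_version")
    schema = SCHEMAS.get((source, version))
    if schema is None:
        return f"unknown schema version {version!r}"
    for name, expected in schema.items():
        if name not in record:
            return f"missing field {name!r}"
        if not isinstance(record[name], expected):
            return f"field {name!r} is not {expected.__name__}"
    return None

def ingest(source: str, records: list[dict[str, Any]]) -> IngestResult:
    """Accept valid records; quarantine the rest instead of failing the batch."""
    result = IngestResult()
    for record in records:
        reason = validate(source, record)
        if reason is None:
            result.accepted.append(record)
        else:
            result.quarantined.append((record, reason))
    return result
```

Quarantining bad records rather than rejecting the whole batch keeps one malformed feed entry from blocking the rest, and keeping older versions in the registry is what lets upstream changes roll out without breaking consumers.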
2. Real-Time Pricing Model Execution Under Latency Constraints
Computing a price recommendation requires joining property features, nearby comparable listings, demand forecasts, event calendars, and booking history, potentially touching millions of records. Meeting the 500ms SLA demands careful optimization of both data retrieval and model execution.
Hints to consider:
- Pre-compute and materialize intermediate results like comparable property clusters and demand indices during off-peak batch windows
- Use geospatial indexes and proximity caching to quickly retrieve relevant comparables within a radius without full table scans
- Implement multi-tier caching with hot property data in Redis and warm aggregates in a denormalized query store like Elasticsearch
- Consider feature stores that maintain precomputed features per property refreshed asynchronously so the API only executes lightweight scoring
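To illustrate the geospatial-index hint above, the following is a small in-memory sketch of a grid index for finding comparables within a radius. The `GridIndex` class, the 0.01-degree cell size, and the 110 km-per-degree constant are assumptions made for this example; it is an approximation intended for city-scale radii away from the poles, and a real deployment would typically use a database or cache with native geospatial support.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01       # assumed cell size, roughly 1.1 km of latitude
KM_PER_DEG = 110.0    # conservative lower bound, so the scan window errs wide

def cell_of(lat: float, lon: float) -> tuple[int, int]:
    """Map a coordinate to its grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

class GridIndex:
    def __init__(self) -> None:
        self._cells: dict[tuple[int, int], list[tuple[str, float, float]]] = defaultdict(list)

    def add(self, listing_id: str, lat: float, lon: float) -> None:
        self._cells[cell_of(lat, lon)].append((listing_id, lat, lon))

    def within(self, lat: float, lon: float, radius_km: float) -> list[str]:
        """Return sorted ids of listings within radius_km of (lat, lon)."""
        # Scan only the cells that can intersect the circle, plus one cell of margin.
        lat_span = math.ceil(radius_km / KM_PER_DEG / CELL_DEG) + 1
        km_per_lon_deg = KM_PER_DEG * max(math.cos(math.radians(lat)), 0.01)
        lon_span = math.ceil(radius_km / km_per_lon_deg / CELL_DEG) + 1
        ci, cj = cell_of(lat, lon)
        hits = []
        for i in range(ci - lat_span, ci + lat_span + 1):
            for j in range(cj - lon_span, cj + lon_span + 1):
                # .get avoids creating empty cells in the defaultdict.
                for listing_id, la, lo in self._cells.get((i, j), ()):
                    if haversine_km(lat, lon, la, lo) <= radius_km:
                        hits.append(listing_id)
        return sorted(hits)
```

The cell lookup narrows candidates to a handful of buckets, and the exact distance check runs only on those candidates, which is the property that avoids a full table scan.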
3. Handling External Data Source Failures and Staleness
Third-party APIs for event data, weather, and competitor prices are outside your control and may experience downtime, rate limits, or increased latency. The system must degrade gracefully without serving wildly incorrect pricing or timing out user requests.
Hints to consider:
- Maintain versioned snapshots of external data with timestamps so the pricing engine can fall back to the most recent available data
- Implement circuit breakers per external source with exponential backoff and fast-fail semantics to prevent thread exhaustion
- Use staleness indicators (for example, the age of each input signal) that lower the recommendation confidence score shown to owners when data is outdated
- Design a data quality monitoring layer that alerts when key signals like event data or competitor pricing haven't refreshed within expected SLAs
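The circuit-breaker and snapshot-fallback hints above can be combined in one small wrapper. This is a sketch under stated assumptions: the `CircuitBreaker` class, its parameters, and the `(value, age, source)` return shape are hypothetical, and it uses a fixed cool-down for brevity where the hint suggests exponential backoff.

```python
import time
from typing import Any, Callable, Optional

class CircuitBreaker:
    """Wraps one external source; falls back to the last good snapshot."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        failure_threshold: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self._fetch = fetch
        self._threshold = failure_threshold
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None
        self._snapshot: Optional[tuple[Any, float]] = None  # (value, fetched_at)

    def get(self) -> tuple[Any, float, str]:
        """Return (value, age_seconds, source), where source is 'live' or 'snapshot'."""
        now = self._clock()
        is_open = (
            self._opened_at is not None
            and now - self._opened_at < self._reset_after_s
        )
        if not is_open:
            # Closed, or cool-down elapsed: allow a call through.
            try:
                value = self._fetch()
            except Exception:
                self._failures += 1
                if self._failures >= self._threshold:
                    self._opened_at = now  # open (or re-open) the circuit
            else:
                self._failures = 0
                self._opened_at = None
                self._snapshot = (value, now)
                return value, 0.0, "live"
        # Fast-fail path: serve the most recent snapshot with its age.
        if self._snapshot is None:
            raise RuntimeError("source unavailable and no snapshot to fall back to")
        value, fetched_at = self._snapshot
        return value, now - fetched_at, "snapshot"
```

Returning the snapshot age alongside the value gives the pricing engine what it needs to lower a confidence score, and gives monitoring a direct measure of how stale each source has become.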
4. Pricing Consistency and Owner Override Management
When an owner manually overrides a system recommendation, those changes must propagate to all guest-facing surfaces quickly. Simultaneously, the system must continue generating fresh recommendations without overwriting manual overrides or creating race conditions during concurrent updates.
Hints to consider:
- Maintain separate tables or columns for system-generated recommendations and owner-confirmed prices with explicit precedence rules
- Use optimistic locking or versioning on price records to detect and resolve concurrent modifications from owner dashboard and batch updater
- Implement an event-driven architecture where price changes emit events consumed by cache invalidation and guest-facing search indexes
- Consider a write-ahead log or outbox pattern to guarantee that price updates are durably recorded before acknowledging to the owner
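The precedence and optimistic-locking hints above might look like the following in miniature. The `PriceRecord` and `PriceStore` names are hypothetical, the store is an in-memory stand-in for a database row with a version column, and the rule that owner overrides bypass the floor and ceiling clamp is an assumption of this sketch rather than something the requirements specify.

```python
from dataclasses import dataclass, replace
from typing import Optional

class StaleWriteError(Exception):
    """Raised when a writer's view of the record is out of date."""

@dataclass(frozen=True)
class PriceRecord:
    floor: float
    ceiling: float
    recommended: Optional[float] = None  # written by the batch updater
    override: Optional[float] = None     # written by the owner dashboard
    version: int = 0

def effective_price(rec: PriceRecord) -> Optional[float]:
    """Owner override wins; otherwise clamp the recommendation to [floor, ceiling]."""
    if rec.override is not None:
        return rec.override  # assumption: overrides are taken verbatim
    if rec.recommended is None:
        return None
    return min(max(rec.recommended, rec.floor), rec.ceiling)

class PriceStore:
    """In-memory stand-in for a table keyed by (property, date)."""

    def __init__(self) -> None:
        self._rows: dict[tuple[str, str], PriceRecord] = {}

    def create(self, key: tuple[str, str], floor: float, ceiling: float) -> None:
        self._rows[key] = PriceRecord(floor, ceiling)

    def read(self, key: tuple[str, str]) -> PriceRecord:
        return self._rows[key]

    def update(self, key: tuple[str, str], expected_version: int, **changes) -> PriceRecord:
        """Compare-and-set: apply changes only if the version is unchanged."""
        current = self._rows[key]
        if current.version != expected_version:
            raise StaleWriteError(
                f"expected version {expected_version}, found {current.version}"
            )
        updated = replace(current, version=current.version + 1, **changes)
        self._rows[key] = updated
        return updated
```

Because the batch updater writes only `recommended` and the dashboard writes only `override`, a stale batch write is rejected and retried against the fresh record without ever touching the owner's price.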