Design Meta Marketplace
System Design (Optional)
Problem Statement
Design a platform that enables users to buy and sell items within their local area, similar to Craigslist, OfferUp, or Meta Marketplace. Users should be able to post listings with photos and descriptions, discover items for sale near them using location-based search, filter by category and price, and communicate with sellers directly through the platform.
The system must handle millions of active listings across diverse geographic regions, support high read traffic from buyers browsing multiple times per day, and provide low-latency search results even in densely populated metropolitan areas. Location accuracy and freshness of listings are critical -- buyers expect to see newly posted items within minutes and want to avoid wasting time on sold items. The platform should scale gracefully as listing volumes and concurrent users grow.
Key Requirements
Functional
- Listing creation -- sellers can post items with title, description, price, photos, category, and location
- Proximity-based search -- buyers can discover listings within a specified radius, filtered by category, price range, and keywords
- Listing management -- sellers can edit prices, mark items as sold, delete listings, and upload additional photos
- Direct messaging -- buyers and sellers can exchange messages to negotiate price and arrange pickup
- Saved searches and alerts -- users can save search criteria and receive notifications when matching listings appear
Non-Functional
- Scalability -- support 100M+ active listings globally, 10K+ new listings per minute during peak hours, 50K+ queries per second
- Reliability -- 99.9% uptime for core search and listing creation flows, graceful degradation during regional outages
- Latency -- p95 search latency under 200ms, listing creation under 500ms, image uploads processed within 10 seconds
- Consistency -- eventual consistency acceptable for search index (up to 1-2 minute delay), strong consistency for listing ownership and sold status
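A quick back-of-envelope pass over these numbers helps anchor later design choices. The per-listing record size, average photo count, and photo size below are illustrative assumptions, not part of the stated requirements:

```python
# Back-of-envelope sizing from the stated requirements. Record and photo
# sizes are assumed values for illustration only.
active_listings = 100_000_000
new_listings_per_min = 10_000
search_qps = 50_000

record_bytes = 2_000            # assumed: title, description, price, geo, photo URLs
photos_per_listing = 5          # assumed average
photo_bytes = 300_000           # assumed average after compression

metadata_gb = active_listings * record_bytes / 1e9
photo_tb = active_listings * photos_per_listing * photo_bytes / 1e12
write_qps = new_listings_per_min / 60

print(f"listing metadata: ~{metadata_gb:.0f} GB")                          # ~200 GB
print(f"photo storage:   ~{photo_tb:.0f} TB")                              # ~150 TB
print(f"listing writes:  ~{write_qps:.0f}/s vs {search_qps} search QPS")   # ~167/s
```

The takeaway: listing metadata fits comfortably in a sharded database, photos dominate storage (hence object storage plus CDN), and reads outnumber writes by two to three orders of magnitude, which justifies a read-optimized search index.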
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Geospatial Indexing and Query Optimization
Interviewers want to see that you understand specialized data structures for proximity search and won't naively compute distances against every listing in the database. They'll probe how you handle uneven geographic density -- downtown areas with thousands of listings per square kilometer versus rural regions with sparse inventory.
Hints to consider:
- Partition the world into hierarchical tiles (quadtree, geohash, S2 cells) to narrow search space before computing exact distances
- Use a dedicated search engine like Elasticsearch with native geo_point queries or spatial indexes rather than relying on SQL databases
- Consider tiered caching strategies where popular urban grid cells are cached aggressively while rural queries go directly to the index
- Handle edge cases like radius queries that span multiple tiles, listings sitting near tile boundaries, and searches that cross the antimeridian or approach the poles
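The first two hints can be sketched as a minimal two-phase proximity search: map listings to fixed-size tiles (a flat stand-in for quadtrees, geohashes, or S2 cells), gather the 3x3 tile neighborhood around the query point so boundary-adjacent listings aren't missed, then compute exact great-circle distances only for the candidates. The zoom level, listing shape, and single-ring expansion are illustrative assumptions; one ring of neighbors only suffices when the radius is smaller than a tile.

```python
import math

def tile_of(lat: float, lon: float, zoom: int) -> tuple[int, int]:
    """Map a coordinate to a Web-Mercator-style tile at the given zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

def haversine_km(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance between two points, in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def search_nearby(listings, lat, lon, radius_km, zoom=12):
    """Phase 1: narrow to candidate tiles. Phase 2: exact distance filter."""
    cx, cy = tile_of(lat, lon, zoom)
    # Include neighboring tiles so listings near tile boundaries are found.
    candidate_tiles = {(cx + dx, cy + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
    results = []
    for item in listings:
        if tile_of(item["lat"], item["lon"], zoom) in candidate_tiles:
            d = haversine_km(lat, lon, item["lat"], item["lon"])
            if d <= radius_km:
                results.append((d, item["id"]))
    return sorted(results)
```

In production the tile lookup would be an indexed query (a geohash-prefix term filter or an Elasticsearch geo_distance query), but the two-phase shape -- cheap coarse filter, exact check on survivors -- is the core idea interviewers look for.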
2. Image Storage and Processing Pipeline
Photos are the primary content in marketplace listings, and interviewers will expect you to articulate a scalable approach to ingestion, storage, and delivery. They'll look for awareness of the different stages -- upload, processing, storage, and serving -- and the tradeoffs between synchronous and asynchronous flows.
Hints to consider:
- Decouple image uploads from listing creation using presigned URLs to object storage (S3, GCS) to avoid blocking the write path
- Generate multiple thumbnail sizes asynchronously via worker queues to optimize bandwidth for different device types and screen sizes
- Store only metadata (URLs, dimensions, file hashes) in the primary database, not the binary data itself
- Use a CDN for global image delivery with appropriate cache headers, and consider lazy-loading progressive JPEGs for faster perceived load times
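One way the decoupled upload flow above might fit together, sketched in plain Python with in-memory stand-ins for the object store, worker queue, and primary database. The presign helper, URLs, and thumbnail sizes here are all hypothetical:

```python
import hashlib
import queue

THUMB_SIZES = [(150, 150), (600, 600)]   # illustrative variants

resize_jobs: "queue.Queue[dict]" = queue.Queue()   # stand-in for a real worker queue
photo_metadata: dict[str, dict] = {}               # stand-in for the primary database

def presign_upload(listing_id: str, filename: str) -> str:
    """Hypothetical stand-in for an object-store presigned PUT URL."""
    return f"https://uploads.example.com/{listing_id}/{filename}?sig=..."

def on_upload_complete(listing_id: str, filename: str, data: bytes) -> None:
    """Store only metadata in the database; enqueue async thumbnail generation."""
    key = f"{listing_id}/{filename}"
    photo_metadata[key] = {
        "url": f"https://cdn.example.com/{key}",
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),  # enables dedup and integrity checks
        "thumbnails": [],
    }
    resize_jobs.put({"key": key})

def thumbnail_worker() -> None:
    """Drain the queue, recording the variant URLs a real resizer would produce."""
    while not resize_jobs.empty():
        key = resize_jobs.get()["key"]
        photo_metadata[key]["thumbnails"] = [
            f"https://cdn.example.com/{key}?w={w}&h={h}" for w, h in THUMB_SIZES
        ]
```

The key property to call out: listing creation never blocks on image bytes. The client uploads directly to object storage via the presigned URL, and thumbnails appear eventually, so a listing can go live with originals while variants are still processing.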
3. Real-Time Index Updates and Cache Invalidation
A critical differentiator for marketplaces is how quickly new listings appear in search results and how stale listings are removed. Interviewers will probe your understanding of event-driven architectures and the tradeoffs between write latency, index freshness, and system complexity.
Hints to consider:
- Use an outbox pattern or change data capture to emit listing events (created, updated, sold, deleted) to a message queue like Kafka
- Have dedicated indexing workers consume events and update Elasticsearch clusters, decoupling write availability from search indexing
- Implement versioning or timestamps to handle out-of-order events and idempotent updates
- Invalidate or update geo-tile caches reactively based on the same event stream to maintain consistency between cache and index
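A minimal sketch of the versioning hint: an indexing worker that applies listing events idempotently by comparing per-listing versions, as an outbox table or CDC pipeline would assign them. The in-memory dict stands in for Elasticsearch, and the event shape is an assumption:

```python
search_index: dict[str, dict] = {}   # stand-in for an Elasticsearch document store

def apply_event(event: dict) -> bool:
    """Apply an event only if it is newer than what the index already holds.

    Returns False for duplicate or out-of-order deliveries, making the
    worker safe to run with at-least-once queue semantics.
    """
    doc = search_index.get(event["listing_id"])
    if doc is not None and doc["version"] >= event["version"]:
        return False                      # stale or duplicate: no-op
    if event["type"] == "deleted":
        search_index.pop(event["listing_id"], None)
        # In production, keep a tombstone for a while so a late, stale
        # "created" event cannot resurrect a deleted listing.
        return True
    search_index[event["listing_id"]] = {
        "version": event["version"],
        "status": event.get("status", "active"),
        "price": event.get("price"),
    }
    return True
```

Elasticsearch supports this pattern natively via external versioning on writes; the same version check can gate geo-tile cache updates so cache and index never disagree on which event is newest.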
4. Handling Geographic Hotspots and Traffic Spikes
Not all locations are equal -- major cities can have 100x the listing density and query volume of rural areas. Interviewers want to see that you anticipate hotspots and have strategies to prevent them from causing cascading failures or degraded experience.
Hints to consider:
- Shard the search index by geography (e.g., one shard per metro area or high-level geohash) so hot regions don't overload shared resources
- Use multi-level caching where frequently accessed tiles (Manhattan, downtown SF) live in an in-memory cache with a longer TTL than cold regions
- Rate-limit aggressive scrapers and bots that might slam specific geographic areas
- Consider read replicas or federated search to spread load across multiple Elasticsearch clusters when query volume exceeds a single cluster's capacity
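The multi-level caching hint can be sketched as a per-tile result cache that promotes frequently queried tiles to a longer TTL. The class name, thresholds, and TTL values below are illustrative assumptions:

```python
import time

class TieredTileCache:
    """Cache search results per geo tile, with longer TTLs for hot tiles."""

    def __init__(self, hot_ttl: float = 60.0, cold_ttl: float = 5.0,
                 hot_threshold: int = 100):
        self.hot_ttl, self.cold_ttl = hot_ttl, cold_ttl
        self.hot_threshold = hot_threshold
        self.hits: dict = {}    # tile -> access count (an approximate counter in practice)
        self.store: dict = {}   # tile -> (expires_at, cached_results)

    def ttl_for(self, tile) -> float:
        """Hot tiles tolerate slightly staler results in exchange for hit rate."""
        hot = self.hits.get(tile, 0) >= self.hot_threshold
        return self.hot_ttl if hot else self.cold_ttl

    def get(self, tile):
        self.hits[tile] = self.hits.get(tile, 0) + 1
        entry = self.store.get(tile)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        return None   # miss or expired: caller falls through to the search index

    def put(self, tile, results) -> None:
        self.store[tile] = (time.monotonic() + self.ttl_for(tile), results)
```

The tradeoff to articulate in an interview: a longer TTL on Manhattan means a newly posted listing there may take up to that TTL to appear in cached results, which is why the event stream from the previous section should also invalidate hot tiles rather than waiting for expiry.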