Design Catalog System
Problem Statement
You are tasked with designing a marketplace listing system that enables sellers to create product listings by uploading item descriptions, pricing information, and multiple images. The system must automatically screen submissions to detect and block prohibited content such as weapons, counterfeit goods, and other policy violations. Users should receive immediate feedback on whether their listing was accepted or rejected, along with specific reasons for any rejections.
The platform is expected to handle millions of active sellers with hundreds of thousands of new listings created daily. Peak traffic occurs during holiday seasons and promotional events, where submission rates can triple. The system must balance speed of approval with accuracy of content moderation, as false positives frustrate legitimate sellers while false negatives expose the marketplace to legal and reputational risks.
Key Requirements
Functional
- Listing Creation -- sellers can submit products with title, description, price, category, and 5-10 images
- Automated Content Moderation -- system analyzes text and images to detect prohibited items, offensive content, and policy violations
- Real-time Feedback -- sellers receive approval or rejection status within seconds of submission
- Multi-stage Review -- sellers can appeal rejected listings for human review
- Search and Discovery -- approved listings appear in search results and category browsing within minutes
Non-Functional
- Scalability -- support 100K+ listing submissions per hour during peak times
- Reliability -- 99.9% uptime for submission pipeline; no data loss on accepted listings
- Latency -- automated moderation completes within 3-5 seconds; listing appears in search within 2 minutes
- Consistency -- eventual consistency acceptable for search indexing; strong consistency for moderation decisions
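A quick back-of-envelope calculation helps turn these targets into per-second rates before you size any component. The sketch below uses only the peak submission rate and image count from the requirements. The average image size is an assumption chosen for illustration and does not come from the problem statement.

```python
# Back-of-envelope peak load derived from the stated requirements.
PEAK_LISTINGS_PER_HOUR = 100_000   # scalability requirement
IMAGES_PER_LISTING = (5, 10)       # listing creation requirement (min, max)
ASSUMED_AVG_IMAGE_MB = 2           # assumption for illustration only


def peak_load(listings_per_hour: int = PEAK_LISTINGS_PER_HOUR) -> dict:
    """Convert an hourly submission rate into per-second rates."""
    listings_per_sec = listings_per_hour / 3600
    images_per_sec = tuple(n * listings_per_sec for n in IMAGES_PER_LISTING)
    upload_mb_per_sec = tuple(r * ASSUMED_AVG_IMAGE_MB for r in images_per_sec)
    return {
        "listings_per_sec": listings_per_sec,
        "images_per_sec": images_per_sec,
        "upload_mb_per_sec": upload_mb_per_sec,
    }


load = peak_load()
print(f"{load['listings_per_sec']:.1f} listings/s")
print(f"{load['images_per_sec'][0]:.0f}-{load['images_per_sec'][1]:.0f} images/s")
```

At the stated peak this works out to roughly 28 listings per second and about 140 to 280 images per second, which suggests that image moderation, rather than request handling, is the likely capacity driver.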
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Content Moderation Pipeline Design
The moderation system is the most critical and complex component. Interviewers want to see how you balance accuracy, speed, and cost while handling diverse content types.
Hints to consider:
- Consider a multi-tiered approach combining rule-based filters, ML models, and human review
- Discuss how you would handle image analysis separately from text analysis given different processing times
- Think about queueing strategies to handle traffic spikes without degrading moderation quality
- Address the feedback loop for continuously improving moderation models based on appeal outcomes
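The tiered approach in the hints above can be sketched as a cheap rule filter, followed by a model score, with uncertain cases routed to human review. This is a minimal illustration, not a production design: the blocked-term list, the thresholds, and the `score_fn` callback (a stand-in for whatever ML model you would use) are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Decision(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"
    NEEDS_HUMAN_REVIEW = "needs_human_review"


@dataclass
class Verdict:
    decision: Decision
    reasons: List[str] = field(default_factory=list)


# Hypothetical rule list; a real system would load policy rules from config.
BLOCKED_TERMS = {"firearm", "replica rolex"}


def rule_filter(text: str) -> List[str]:
    """Tier 1: cheap substring rules that run before any model call."""
    lowered = text.lower()
    return [f"blocked term: {t}" for t in sorted(BLOCKED_TERMS) if t in lowered]


def moderate(
    text: str,
    score_fn: Callable[[str], float],  # stand-in for an ML risk model (0..1)
    reject_at: float = 0.9,            # assumed threshold
    approve_below: float = 0.3,        # assumed threshold
) -> Verdict:
    reasons = rule_filter(text)
    if reasons:
        return Verdict(Decision.REJECTED, reasons)

    # Tier 2: model score; only confident scores are decided automatically.
    score = score_fn(text)
    if score >= reject_at:
        return Verdict(Decision.REJECTED, [f"model risk score {score:.2f}"])
    if score < approve_below:
        return Verdict(Decision.APPROVED)

    # Tier 3: the uncertain middle band goes to a human reviewer.
    return Verdict(Decision.NEEDS_HUMAN_REVIEW, [f"uncertain model score {score:.2f}"])
```

Returning the reasons alongside the decision is what lets the seller see why a listing was rejected. The two thresholds are the knobs for trading false positives against false negatives.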
2. Storage Architecture for Media Assets
Image handling at scale involves significant storage, bandwidth, and processing considerations that reveal architectural thinking.
Hints to consider:
- Consider object storage for raw uploads and a CDN for serving optimized versions
- Discuss image preprocessing pipeline for generating thumbnails and multiple resolutions
- Think about how to prevent duplicate uploads and optimize storage costs
- Address backup and disaster recovery for user-uploaded content
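One common way to avoid storing duplicate uploads is content addressing: key each image by a hash of its bytes so that identical files map to the same object. The sketch below uses an in-memory dictionary as a stand-in for an object store, and the `ImageStore` class is hypothetical.

```python
import hashlib
from typing import Dict, Tuple


class ImageStore:
    """In-memory stand-in for an object store keyed by content hash."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> Tuple[str, bool]:
        """Store the bytes once; return (key, was_newly_stored)."""
        key = hashlib.sha256(data).hexdigest()
        is_new = key not in self._blobs
        if is_new:
            self._blobs[key] = data
        return key, is_new

    def get(self, key: str) -> bytes:
        return self._blobs[key]


store = ImageStore()
key_a, stored_a = store.put(b"fake image bytes")
key_b, stored_b = store.put(b"fake image bytes")  # same content, not stored again
```

This only catches byte-identical files. A resized or re-encoded copy hashes differently, so detecting near-duplicates would need a separate technique such as perceptual hashing.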
3. Handling Asynchronous Processing and User Feedback
The system runs several asynchronous steps yet must still meet sellers' expectations for timely feedback.
Hints to consider:
- Design a state machine for listing status transitions from submission through moderation to publication
- Consider websockets, polling, or push notifications for real-time status updates
- Think about retry mechanisms when downstream services fail during processing
- Address idempotency to handle duplicate submissions from network retries
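The state machine and idempotency hints above can be combined in one small sketch: an explicit table of allowed status transitions, plus a client-supplied idempotency key so that a retried submission returns the existing listing instead of creating a second one. The status names, the `ListingService` class, and the ID format are all hypothetical.

```python
from enum import Enum
from typing import Dict, Set


class Status(Enum):
    SUBMITTED = "submitted"
    IN_MODERATION = "in_moderation"
    APPROVED = "approved"
    REJECTED = "rejected"
    IN_APPEAL = "in_appeal"
    PUBLISHED = "published"
    REMOVED = "removed"


# Assumed transition table; adjust to your own lifecycle.
ALLOWED: Dict[Status, Set[Status]] = {
    Status.SUBMITTED: {Status.IN_MODERATION},
    Status.IN_MODERATION: {Status.APPROVED, Status.REJECTED},
    Status.APPROVED: {Status.PUBLISHED},
    Status.REJECTED: {Status.IN_APPEAL},
    Status.IN_APPEAL: {Status.APPROVED, Status.REJECTED},
    Status.PUBLISHED: {Status.REMOVED},
    Status.REMOVED: set(),
}


class ListingService:
    def __init__(self) -> None:
        self._id_by_key: Dict[str, str] = {}   # idempotency key -> listing id
        self._status: Dict[str, Status] = {}

    def submit(self, idempotency_key: str) -> str:
        """Create a listing once per key; retries get the same listing id."""
        if idempotency_key in self._id_by_key:
            return self._id_by_key[idempotency_key]
        listing_id = f"listing-{len(self._status) + 1}"
        self._id_by_key[idempotency_key] = listing_id
        self._status[listing_id] = Status.SUBMITTED
        return listing_id

    def transition(self, listing_id: str, new_status: Status) -> None:
        """Apply a status change, rejecting anything not in the table."""
        current = self._status[listing_id]
        if new_status not in ALLOWED[current]:
            raise ValueError(f"{current.value} -> {new_status.value} not allowed")
        self._status[listing_id] = new_status

    def status(self, listing_id: str) -> Status:
        return self._status[listing_id]
```

In a real deployment the key-to-ID mapping and the status update would each need to be atomic in a durable store. The in-memory dictionaries here only illustrate the logic.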
4. Search Indexing and Consistency Trade-offs
Making listings searchable within minutes of approval while keeping the system scalable requires deliberate consistency decisions.
Hints to consider:
- Discuss event-driven architecture to trigger search indexing upon approval
- Consider whether to use synchronous indexing, which adds latency to approval, or asynchronous indexing, which leaves search temporarily inconsistent
- Think about cache invalidation strategies for category pages and search results
- Address how to handle listings that are later removed for policy violations
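With asynchronous, event-driven indexing, events can arrive late or more than once, so a stale "approved" event must not bring back a listing that has since been removed. One way to guard against this is a per-listing version number, as in the sketch below. The event shape, the `SearchIndex` class, and the substring search are hypothetical stand-ins for a real message bus and search engine.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ListingEvent:
    listing_id: str
    version: int      # assumed to increase with every status change
    kind: str         # "approved" or "removed"
    title: str = ""


class SearchIndex:
    """Toy index that applies events idempotently and in version order."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}
        self._versions: Dict[str, int] = {}

    def apply(self, event: ListingEvent) -> bool:
        """Return True if the event changed the index, False if ignored."""
        if event.kind not in ("approved", "removed"):
            raise ValueError(f"unknown event kind: {event.kind}")
        if event.version <= self._versions.get(event.listing_id, 0):
            return False  # stale or duplicate delivery
        self._versions[event.listing_id] = event.version
        if event.kind == "approved":
            self._docs[event.listing_id] = event.title
        else:
            self._docs.pop(event.listing_id, None)
        return True

    def search(self, term: str) -> List[str]:
        needle = term.lower()
        return sorted(i for i, t in self._docs.items() if needle in t.lower())


index = SearchIndex()
index.apply(ListingEvent("l1", version=2, kind="removed"))
# The older approval arrives late and is ignored, so "l1" stays out of search.
index.apply(ListingEvent("l1", version=1, kind="approved", title="Oak desk"))
```

The same version check makes redelivered events harmless, which fits the requirement that eventual consistency is acceptable for search indexing.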