Design a System to Identify and Block Weapons on Marketplace

ML System Design | Must

Problem Statement

You're tasked with designing an automated content moderation system for a large-scale online marketplace that processes millions of product listings daily. The system must detect and prevent users from listing prohibited items, with a specific focus on weapons, firearms, explosives, and related accessories. The challenge is to build a system that can accurately identify violations across text descriptions, product titles, images, and user behavior patterns while minimizing false positives that could frustrate legitimate sellers. The system needs to handle real-time moderation for new listings, batch processing for existing inventory, and continuous learning to adapt to evolving evasion tactics.

Your solution should balance precision (avoiding false flags on legitimate items like toy replicas or collectibles) with recall (catching actual policy violations), while maintaining sub-second latency for the user experience and handling peak traffic of 100,000+ listings per minute during major sales events.

Key Requirements

Functional

Automated Detection -- Identify prohibited weapons and related items across multiple modalities including text, images, and metadata
Real-time Moderation -- Block policy-violating listings before they go live on the marketplace
Multi-language Support -- Detect violations in product descriptions across different languages and regional marketplaces
Appeals Workflow -- Allow legitimate sellers to contest false positives with human review
Pattern Recognition -- Identify coordinated sellers attempting to circumvent filters through coded language or fragmented listings

Non-Functional

Scalability -- Handle 100,000+ listing submissions per minute with ability to scale during peak periods
Reliability -- Maintain 99.9% uptime with graceful degradation if ML models fail
Latency -- Complete moderation checks within 500ms for real-time listings to avoid user friction
Consistency -- Ensure uniform policy enforcement across all regions while respecting local regulations
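
The scalability and latency targets above can be turned into a rough sizing estimate. The sketch below is back-of-envelope arithmetic only: `per_replica_rps` (what one model-serving replica can sustain) and `headroom` (a safety multiplier for bursts) are assumed values, not figures from the problem statement.

```python
import math

def capacity_estimate(listings_per_minute: int, latency_budget_s: float,
                      per_replica_rps: float, headroom: float = 2.0) -> dict:
    """Back-of-envelope sizing from the stated requirements."""
    rps = listings_per_minute / 60.0
    # Little's law: average in-flight requests = arrival rate * time in system.
    in_flight = rps * latency_budget_s
    # Replicas needed if each one sustains per_replica_rps (an assumption).
    replicas = math.ceil(rps * headroom / per_replica_rps)
    return {"rps": rps, "in_flight": in_flight, "replicas": replicas}

# 100,000 listings/minute, 500 ms budget, assumed 200 req/s per replica.
est = capacity_estimate(100_000, 0.5, per_replica_rps=200)
print(est)
```

With these inputs the peak works out to roughly 1,667 listings per second and about 833 requests in flight at any moment, which is the number to keep in mind when sizing queues and connection pools.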

What Interviewers Focus On

Based on real interview experiences, these are the areas interviewers probe most deeply:

1. Multi-Modal ML Pipeline Design

Interviewers want to see how you orchestrate multiple detection mechanisms (text classifiers, image recognition, behavioral signals) into a cohesive system. They'll probe your understanding of when to run models in parallel versus sequentially, and how to combine confidence scores.

Hints to consider:

Consider using ensemble methods where different models vote on the classification outcome
Think about early-exit strategies where high-confidence detections skip expensive secondary checks
Discuss how to handle cases where the text says "camping knife" but the image shows an assault rifle
Consider using embedding-based similarity to catch variations of known prohibited terms
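
One possible way to combine the early-exit and score-fusion hints above is sketched here. It assumes each model is a plain callable returning a violation probability in [0, 1]; the names `moderate` and `noisy_or`, the 0.95 early-exit threshold, and the 0.5 mismatch gap are all illustrative. Noisy-OR is just one fusion rule among several (weighted averaging or a learned meta-classifier are alternatives); it is chosen here because a single confident modality is enough to raise the fused score.

```python
from typing import Callable, Dict

Scorer = Callable[[dict], float]  # hypothetical: listing dict -> P(violation)

def noisy_or(*scores: float) -> float:
    """Fused score is high if ANY modality is confident."""
    p_clean = 1.0
    for s in scores:
        p_clean *= (1.0 - s)
    return 1.0 - p_clean

def moderate(listing: dict, text_model: Scorer, image_model: Scorer,
             early_exit: float = 0.95) -> Dict[str, object]:
    # Cheap text model runs first.
    text_score = text_model(listing)
    if text_score >= early_exit:
        # High-confidence text hit: skip the expensive image model.
        return {"score": text_score, "ran_image_model": False, "mismatch": False}
    image_score = image_model(listing)
    return {
        "score": noisy_or(text_score, image_score),
        "ran_image_model": True,
        # A large gap between modalities is itself a signal worth logging,
        # e.g. innocuous title paired with a weapon photo.
        "mismatch": abs(text_score - image_score) > 0.5,
    }

# Title looks harmless (0.05) but the image model is confident (0.90).
result = moderate({"title": "camping knife"}, lambda l: 0.05, lambda l: 0.90)
print(result)
```

In the "camping knife" example the fused score stays high because the image model is confident, and the `mismatch` flag records that the two modalities disagreed.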

2. False Positive Management

Your design must minimize frustration for legitimate sellers while maintaining safety. Interviewers will challenge you on the precision-recall tradeoff and how you'd tune the system over time.

Hints to consider:

Implement confidence thresholds where borderline cases get human review instead of automatic rejection
Track seller reputation and listing history to adjust sensitivity per user
Build an active learning loop where human reviewers' decisions retrain models
Consider contextual signals like category (sporting goods vs electronics) to adjust thresholds
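
The threshold hints above might look like the following routing function. This is a minimal sketch under stated assumptions: the threshold values, the `CATEGORY_OFFSET` table, and the `seller_trust` score in [0, 1] are hypothetical, and in practice they would be tuned against labeled data. Only the human-review threshold moves with context; the auto-block threshold stays fixed so that trusted sellers cannot bypass a high-confidence detection.

```python
BLOCK_THRESHOLD = 0.90        # assumed: auto-reject at or above this score
BASE_REVIEW_THRESHOLD = 0.50  # assumed: send to human review at or above this

# Hypothetical per-category offsets: categories where lookalike items
# (toy replicas, sporting knives) are common get a higher review bar.
CATEGORY_OFFSET = {"sporting_goods": 0.10, "toys": 0.10}

def decide(score: float, category: str, seller_trust: float) -> str:
    """Route a listing to 'block', 'human_review', or 'allow'."""
    if score >= BLOCK_THRESHOLD:
        return "block"
    review_threshold = (BASE_REVIEW_THRESHOLD
                        + CATEGORY_OFFSET.get(category, 0.0)
                        + 0.10 * seller_trust)
    # Never let context push the review bar above the block bar.
    review_threshold = min(review_threshold, BLOCK_THRESHOLD)
    return "human_review" if score >= review_threshold else "allow"

print(decide(0.55, "electronics", seller_trust=0.0))     # borderline, new seller
print(decide(0.55, "sporting_goods", seller_trust=1.0))  # same score, trusted seller
```

Human-review outcomes for the borderline band are the natural source of labels for the active learning loop mentioned above.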

3. Evasion Tactics and Adversarial Resilience

Bad actors constantly evolve their techniques to bypass filters. Interviewers want to see how you'd build an adaptive system that stays ahead of creative workarounds.

Hints to consider:

Discuss detecting deliberate misspellings, leetspeak, and character substitution (e.g., "w3ap0n")
Consider analyzing fragmented listings where a seller splits one weapon across multiple innocent-looking posts
Track cross-listing patterns and user networks to identify coordinated circumvention attempts
Implement image perturbation detection to catch cases where sellers add noise to fool vision models
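
For the text-evasion hints above, a common first step is to normalize text before it reaches any keyword list or classifier. The sketch below assumes a small, hypothetical substitution table (`LEET`); a production table would be larger and language-aware. Note that mapping digits to letters also mangles legitimate numbers such as model numbers or calibers, so the normalized string would normally be used alongside the raw text, not instead of it.

```python
import re
import unicodedata

# Hypothetical substitution table for common character swaps.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                      "5": "s", "7": "t", "@": "a", "$": "s"})

def normalize(text: str) -> str:
    """Canonicalize text so simple obfuscations collapse to one form."""
    # NFKC folds compatibility characters (e.g. fullwidth letters) to ASCII.
    text = unicodedata.normalize("NFKC", text).lower()
    text = text.translate(LEET)
    # Remove separators inserted between characters: "w.e.a.p.o.n".
    text = re.sub(r"(?<=\w)[.\-_*](?=\w)", "", text)
    # Collapse runs of three or more identical characters: "weeeapon".
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    return text

for raw in ["w3ap0n", "W.3.A.P.0.N", "weeeapon"]:
    print(raw, "->", normalize(raw))
```

Normalization only handles surface-level tricks; coded language and fragmented listings need the embedding-based and graph-based signals described in the other hints.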

4. Real-Time Processing Architecture

The system must make decisions quickly enough not to disrupt the user flow, but thoroughly enough to be effective. Interviewers will probe your understanding of streaming architectures and model serving.

Hints to consider:

Use lightweight models for initial screening with heavier models triggered only for suspicious cases
Consider caching model predictions for similar listings or images (perceptual hashing)
Discuss async vs sync review flows -- what can be flagged post-publication vs must block immediately
Think about circuit breakers and fallback strategies when ML services are overloaded
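
The circuit-breaker hint above can be illustrated with a small in-process sketch. The class name, the failure count, and the reset window are assumptions chosen for illustration; a real deployment would more likely rely on a service mesh or an existing resilience library. The fallback here is any cheaper callable, for example a rule-based filter that holds the listing for review.

```python
import time

class CircuitBreaker:
    """Stop calling a failing model service and use a fallback instead."""

    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, primary, fallback, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                return fallback(*args)      # open: skip the primary entirely
            # Half-open: allow one trial call; one more failure re-opens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = primary(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback(*args)
        self.failures = 0                   # success closes the circuit
        return result

def hold_for_review(listing):
    # Conservative fallback: do not publish, queue for a human.
    return "hold_for_review"
```

Whether the fallback should hold listings or let them through with post-publication review is a policy decision, and it ties directly to the sync versus async question in the hints above.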

5. Compliance and Regional Variations

Weapon policies vary dramatically by country and even by state. Your design must be flexible enough to enforce different rules in different jurisdictions while remaining maintainable.

Hints to consider:

Build a policy engine separate from detection logic so rules can be updated without model retraining
Consider that some items are legal in certain regions (airsoft guns in Japan, knives in hunting regions)
Discuss how to handle cross-border listings and international shipping implications
Think about audit trails and explainability for regulatory compliance
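
Separating policy from detection, as the hints above suggest, could be sketched as a lookup table applied to model output. Everything in `POLICY` is illustrative and not a statement of actual law: the labels, region codes, and actions are hypothetical, with the airsoft entry echoing the example in the hints. For cross-border listings this sketch takes the strictest action across all regions involved, which is one conservative choice among several.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Detection:
    label: str    # what the models think the item is
    score: float  # model confidence in [0, 1]

# Hypothetical rules: regional entries override the default.
POLICY: Dict[str, Dict[str, str]] = {
    "default": {"firearm": "block", "airsoft": "block", "knife": "review"},
    "JP": {"airsoft": "allow"},
}
SEVERITY = {"allow": 0, "review": 1, "block": 2}

def action_for(label: str, region: str) -> str:
    rules = {**POLICY["default"], **POLICY.get(region, {})}
    return rules.get(label, "allow")

def evaluate(detection: Detection, regions: List[str], min_score: float = 0.8) -> dict:
    """Return an action plus the fields needed for an audit record."""
    if detection.score < min_score:
        return {"action": "allow", "reason": "score_below_min",
                "label": detection.label, "score": detection.score}
    per_region = {r: action_for(detection.label, r) for r in regions}
    # Cross-border listing: enforce the strictest applicable rule.
    action = max(per_region.values(), key=SEVERITY.__getitem__)
    return {"action": action, "reason": "policy_rule", "label": detection.label,
            "score": detection.score, "per_region": per_region}

print(evaluate(Detection("airsoft", 0.9), ["JP"]))
print(evaluate(Detection("airsoft", 0.9), ["JP", "US"]))
```

Because the rules live in data, a policy change is a table update with no model retraining, and the returned record gives reviewers and auditors the rule that fired.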
