Build an online platform where developers participate in timed programming contests by solving algorithmic challenges. Users submit code in multiple programming languages, which must be compiled, executed against hidden test cases, and scored in real time. The system needs to safely execute untrusted code, maintain fair contest rules, and display live rankings that update as submissions are evaluated.
Your design must handle thousands of concurrent participants submitting solutions during peak contest hours, ensure no malicious code can compromise infrastructure, deliver verdicts within seconds, and maintain leaderboard accuracy even when many users solve problems simultaneously. The platform should support multiple programming languages, enforce contest time windows, prevent cheating through duplicate detection, and provide a smooth user experience for both practice mode and competitive events.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Executing arbitrary user code is the highest security and reliability risk. Interviewers want to see how you prevent resource exhaustion, sandbox escapes, and cascading failures from malicious or buggy submissions.
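The resource-exhaustion defenses above can be made concrete with a minimal, Unix-only sketch: run the submission in a child process with a hard CPU cap, an address-space cap, and a wall-clock timeout. The function name `run_limited` and the specific limits are illustrative; a production judge would layer on containers, seccomp syscall filtering, a non-root user, and network isolation rather than rely on rlimits alone.

```python
import os
import resource
import subprocess
import sys
import tempfile

def run_limited(code: str, cpu_seconds: int = 2,
                mem_bytes: int = 512 * 1024 * 1024,
                wall_timeout: float = 5.0):
    """Run untrusted Python in a child process under hard resource limits."""
    def set_limits():
        # Applied in the child just before exec: cap CPU time and address space.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True,
            timeout=wall_timeout,     # wall-clock cap, independent of the CPU cap
            preexec_fn=set_limits,    # Unix only
        )
        if proc.returncode != 0:
            return ("RUNTIME_ERROR", proc.stderr)
        return ("OK", proc.stdout)
    except subprocess.TimeoutExpired:
        return ("TIME_LIMIT_EXCEEDED", "")
    finally:
        os.unlink(path)
```

The wall-clock timeout catches submissions that sleep or block on I/O, which a CPU limit alone would miss; both are needed.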
Compilation and test execution are too slow and resource-intensive for synchronous request-response. Interviewers expect a robust async pipeline that handles retries, failures, and backpressure gracefully.
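The retry and dead-letter behavior of such a pipeline can be sketched with an in-process `queue.Queue` standing in for a real broker (Kafka, SQS, RabbitMQ); the names and the attempt-counting scheme are illustrative:

```python
import queue

def run_worker(jobs: "queue.Queue", handler, dead_letter: list,
               max_attempts: int = 3) -> None:
    """Drain the queue, retrying failed jobs up to max_attempts
    before moving them to a dead-letter list for later inspection."""
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return                       # nothing left to do
        try:
            handler(job)
        except Exception:
            job["attempts"] = job.get("attempts", 0) + 1
            if job["attempts"] < max_attempts:
                jobs.put(job)            # re-enqueue for another try
            else:
                dead_letter.append(job)  # give up; surface to operators
```

A bounded queue (`queue.Queue(maxsize=N)`) makes producers block when workers fall behind, which is the simplest form of backpressure; a real broker adds consumer groups, acknowledgments, and delayed redelivery so retries don't immediately hammer a failing worker.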
During active contests, leaderboards must reflect new scores within seconds while supporting thousands of readers. Interviewers look for designs that avoid expensive database queries on every page load.
Competitive integrity requires preventing late submissions, detecting plagiarism, and ensuring all users are judged by identical test cases. Interviewers want to see how you enforce rules at scale.
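A first-line duplicate check can be sketched as a normalized fingerprint: strip comments, collapse whitespace, and hash what remains, so trivially reformatted copies collide. This only catches near-verbatim plagiarism; serious detection compares token streams or ASTs (MOSS-style winnowing) to survive identifier renaming. The function name is illustrative.

```python
import hashlib
import re

def submission_fingerprint(code: str) -> str:
    """Hash a crudely normalized form of the source so that
    whitespace and comment changes produce the same fingerprint."""
    no_comments = re.sub(r"#.*", "", code)                 # strip line comments
    normalized = re.sub(r"\s+", " ", no_comments).strip()  # collapse whitespace
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Storing the fingerprint alongside each submission lets a single indexed equality query flag exact-family duplicates at submit time, deferring expensive pairwise comparison to an offline job.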
Start by confirming the scale and scope with your interviewer. Ask how many concurrent users the system must support during peak contests and whether practice mode has different requirements. Clarify which programming languages are required and whether custom libraries or dependencies need support. Confirm expectations around verdict latency and whether partial credit exists for test cases. Ask about contest formats -- are there team competitions, different scoring rules, or special features like hacking others' solutions? Understanding anti-cheating priorities will guide your security design.
Sketch a system with three main tiers. The API layer handles user requests, serves problem content from cache, accepts submissions, and queries leaderboard state. The evaluation layer consists of worker pools that pull submissions from a queue, compile code in sandboxes, execute test cases with resource limits, and publish verdicts. The storage layer includes a relational database for users, problems, submissions, and contests; a cache for problem statements and user sessions; and an in-memory data structure for live leaderboards. Connect these tiers with a message queue for submission events and a pub-sub channel for real-time verdict notifications to clients.
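The submission event that flows across the queue can be sketched as a small schema (field names are illustrative, not a prescribed format); note that the message carries a reference to the code in blob storage rather than the source itself, keeping queue payloads small:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class SubmissionEvent:
    submission_id: str
    user_id: str
    contest_id: str
    problem_id: str       # also serves as the queue partition key
    language: str
    code_ref: str         # pointer into blob storage, not inline source
    submitted_at: float   # epoch seconds; used for deadlines and tie-breaks

def to_message(event: SubmissionEvent) -> bytes:
    """Serialize the event for the evaluation queue."""
    return json.dumps(asdict(event)).encode("utf-8")
```

Recording `submitted_at` at the API layer, not the worker, is what lets the system enforce contest deadlines fairly even when the evaluation queue is backed up.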
Walk through the lifecycle of a single submission. When a user submits code, the API server validates contest eligibility, assigns a unique submission ID, persists the record with "pending" status, and publishes a message to the evaluation queue partitioned by problem or priority. A worker claims the message, pulls the code and test cases, spins up an isolated container with CPU and memory cgroups, compiles the code with a timeout, and runs it against each test case sequentially. Each test execution is time-boxed and monitored for illegal system calls. The worker aggregates the results (accepted, wrong answer, time limit exceeded), updates the submission record, calculates the user's new score under the contest rules, and atomically updates the leaderboard. If compilation fails or a test times out, the worker marks the submission accordingly without retrying. Discuss how partitioning the queue by problem allows fair resource allocation and prevents one problem from starving the others.
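The verdict-aggregation step can be sketched as a fold over per-test results, assuming ICPC-style all-or-nothing scoring where the first failure determines the verdict (the `run_case` callback and status strings are illustrative):

```python
def judge(run_case, test_cases):
    """Fold per-test results into a single verdict.
    run_case(input_str) -> (status, output); status in {"OK", "TLE", "RE"}.
    Returns (verdict, index of first failing test or count of tests passed)."""
    for i, (inp, expected) in enumerate(test_cases, 1):
        status, out = run_case(inp)
        if status == "TLE":
            return ("TIME_LIMIT_EXCEEDED", i)
        if status == "RE":
            return ("RUNTIME_ERROR", i)
        if out.strip() != expected.strip():   # tolerate trailing whitespace
            return ("WRONG_ANSWER", i)
    return ("ACCEPTED", len(test_cases))
```

Stopping at the first failure saves worker time; a platform with partial credit per test case would instead run every case and return a score rather than a single verdict.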
Cover how you'll implement the live leaderboard. Use a sorted set in Redis keyed by contest ID, with user IDs as members and composite scores as values. When a verdict completes, atomically increment the user's score; the sorted set keeps members ordered by score, so no explicit re-sort is needed. For reads, fetch the top N with a range query and find a user's rank with a rank lookup, both O(log N) (range reads add a term linear in the number of entries returned). Discuss caching strategies for problem statements and how to invalidate them when admins update content. Explain database schema choices -- normalize users, contests, problems, and submissions with foreign keys for referential integrity and indexes on contest_id and user_id for common queries. Address monitoring by logging submission rates, queue depths, worker utilization, and leaderboard update latencies. Mention rate limiting submissions per user to prevent abuse, and how you'd handle dispute resolution by storing complete test outputs for admin review.
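The sorted-set operations can be illustrated with a small in-memory stand-in for the Redis structure (mirroring ZINCRBY, ZREVRANGE, and ZREVRANK; the class is a sketch, not the Redis API). Storing `(-score, user)` tuples keeps higher scores first and breaks ties lexically by user ID, as Redis does:

```python
import bisect

class Leaderboard:
    """In-memory stand-in for one contest's Redis sorted set."""

    def __init__(self):
        self._sorted = []   # [(-score, user), ...] kept in sorted order
        self._scores = {}   # user -> current score

    def incr(self, user: str, delta):
        """Analogue of ZINCRBY: adjust a score and keep the set ordered."""
        old = self._scores.get(user)
        if old is not None:
            i = bisect.bisect_left(self._sorted, (-old, user))
            self._sorted.pop(i)             # remove the stale entry
        new = (old or 0) + delta
        self._scores[user] = new
        bisect.insort(self._sorted, (-new, user))

    def top(self, n: int):
        """Analogue of ZREVRANGE 0 n-1 WITHSCORES."""
        return [(u, -s) for s, u in self._sorted[:n]]

    def rank(self, user: str) -> int:
        """Analogue of ZREVRANK, but 1-based."""
        return bisect.bisect_left(
            self._sorted, (-self._scores[user], user)) + 1
```

Note the list `pop`/`insort` here cost O(N) on the underlying array; Redis uses a skip list to get genuine O(log N) updates, which is exactly why it is the standard choice for this workload.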