For a full example answer with detailed architecture diagrams and deep dives, see our Design LeetCode guide.
Also review the Message Queues and Databases building blocks for background on asynchronous job processing and transactional storage patterns.
Design a competitive online coding platform where developers solve algorithmic challenges, submit solutions in multiple programming languages, and receive automated verdicts. The system compiles and executes untrusted user code inside secure sandboxes, runs it against hidden test suites, and reports results such as Accepted, Wrong Answer, Time Limit Exceeded, or Runtime Error. Users can practice individually or compete in timed contests with live leaderboards that update as verdicts arrive.
Your design must safely execute thousands of concurrent submissions during peak contest windows, deliver verdicts within seconds, prevent cheating through code-similarity detection, and maintain accurate rankings even when many participants solve the same problem simultaneously. The platform should support at least ten programming languages, enforce strict resource limits on every execution, and store a durable history of all submissions for replay and dispute resolution.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Executing arbitrary user code is the single highest-risk component. Interviewers want to understand how you isolate untrusted programs, enforce resource limits, and prevent sandbox escapes or denial-of-service attacks against the judging infrastructure.
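As a concrete starting point, the OS-level half of these limits can be sketched with POSIX rlimits. This is a minimal sketch under stated assumptions: the function name `run_limited` and the default limits are illustrative, and a production judge would layer this inside a container or microVM with seccomp filters, pid limits, a read-only filesystem, and no network access.

```python
import resource
import subprocess

def run_limited(cmd, cpu_seconds=2, memory_bytes=512 * 1024 * 1024):
    """Run an untrusted command with hard CPU and memory caps.

    A minimal OS-level sketch only: real judges add namespace/cgroup
    isolation, syscall filtering, and fork limits on top of this.
    """
    def apply_limits():
        # Hard-kill the process after cpu_seconds of CPU time.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        # Cap the address space to bound memory consumption.
        resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))

    # preexec_fn applies the limits in the child before exec (POSIX only);
    # the wall-clock timeout backstops programs that sleep instead of spin.
    return subprocess.run(
        cmd, preexec_fn=apply_limits,
        capture_output=True, timeout=cpu_seconds + 5,
    )
```

Note that `RLIMIT_CPU` measures CPU time, not wall-clock time, which is why the separate `timeout` backstop matters: a submission that blocks on I/O forever consumes no CPU and would otherwise never be killed.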
Compilation and test execution are too slow for synchronous request-response. Interviewers expect a queue-backed pipeline that decouples ingestion from evaluation and handles backpressure, retries, and partial failures gracefully.
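The queue-backed pattern can be sketched in-process, with a bounded queue standing in for a partitioned broker such as Kafka. All names here (`MAX_ATTEMPTS`, `judge`, the dead-letter list) are illustrative assumptions, not a real broker API; the point is the shape of the flow: enqueue-and-return, bounded buffering for backpressure, bounded retries, and a dead-letter path for poison messages.

```python
import queue
import threading

MAX_ATTEMPTS = 3                  # illustrative retry budget
jobs = queue.Queue(maxsize=1000)  # bounded queue => natural backpressure
dead_letter = []                  # parked jobs for human inspection
verdicts = {}

def submit(sub_id, code):
    """API side: enqueue and return immediately with a Pending status."""
    jobs.put({"id": sub_id, "code": code, "attempts": 0})
    verdicts[sub_id] = "Pending"

def judge(code):
    """Placeholder for compile-and-run; raises on infrastructure failure."""
    return "Accepted" if "return" in code else "Wrong Answer"

def worker():
    while True:
        job = jobs.get()
        if job is None:           # poison pill for graceful drain
            jobs.task_done()
            return
        try:
            verdicts[job["id"]] = judge(job["code"])
        except Exception:
            job["attempts"] += 1
            if job["attempts"] < MAX_ATTEMPTS:
                jobs.put(job)     # retry transient failures
            else:
                dead_letter.append(job)
                verdicts[job["id"]] = "Judge Error"
        finally:
            jobs.task_done()
```

The bounded queue is the backpressure mechanism: when evaluation falls behind, `submit` blocks (or, in a real API, returns 429) rather than letting the backlog grow without limit.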
During active contests, thousands of clients poll for ranking updates. Interviewers look for a design that avoids expensive database scans on every request while keeping scores fresh within seconds.
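One common answer is a sorted set keyed by score. Below is an in-memory stand-in for the Redis commands ZINCRBY and ZREVRANGE (the class and method names are illustrative); keeping this structure in Redis lets every API node serve rankings from memory instead of scanning the submissions table per request.

```python
class Leaderboard:
    """In-memory sketch of a Redis sorted-set leaderboard.

    record_solve mirrors ZINCRBY (atomic under Redis, a plain dict
    update here); top mirrors ZREVRANGE 0 n-1 WITHSCORES.
    """
    def __init__(self):
        self.scores = {}  # user_id -> total contest score

    def record_solve(self, user_id, points):
        self.scores[user_id] = self.scores.get(user_id, 0) + points

    def top(self, n):
        # Highest score first; ties broken by user_id for a stable order
        # (a real contest would break ties by solve time instead).
        return sorted(self.scores.items(),
                      key=lambda kv: (-kv[1], kv[0]))[:n]
```

Because ZINCRBY is atomic, many workers can post verdicts for the same problem concurrently without a read-modify-write race, which addresses the "many participants solve the same problem simultaneously" requirement directly.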
Competitive fairness requires preventing late submissions, detecting plagiarism, and ensuring identical test data for all participants. Interviewers want concrete mechanisms rather than vague policy statements.
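For plagiarism detection, a concrete mechanism to name is k-gram fingerprinting. The sketch below is a simplified MOSS-style approach under stated assumptions: real detectors also canonicalize identifiers (so renaming variables doesn't defeat it) and use winnowing to sample fingerprints, both omitted here; the function names are illustrative.

```python
import re

def fingerprints(code, k=5):
    """Tokenize source, then hash every overlapping k-gram of tokens.

    Simplified sketch: production systems normalize identifiers and
    winnow the hash set before comparing submissions at scale.
    """
    tokens = re.findall(r"\w+|[^\w\s]", code.lower())
    return {hash(tuple(tokens[i:i + k]))
            for i in range(max(len(tokens) - k + 1, 1))}

def similarity(a, b, k=5):
    """Jaccard similarity of two submissions' fingerprint sets (0.0-1.0)."""
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    return len(fa & fb) / len(fa | fb) if fa | fb else 0.0
```

In an interview, pair this with the operational side: run pairwise comparison per problem after the contest (not on the hot path), and flag pairs above a threshold for human review rather than auto-banning.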
Start by confirming scope with your interviewer. Ask how many concurrent users the system must support during peak contests and whether practice mode has different SLAs. Clarify which programming languages are required and whether custom libraries or large dependencies need support. Confirm acceptable verdict latency and whether partial credit (per-test-case scoring) exists. Ask about contest formats -- individual only, or team contests with shared submissions? Understanding anti-cheating priorities will shape your security design.
Sketch three tiers. The API layer serves problem content from cache, accepts submissions, authenticates users, and streams leaderboard data over WebSockets. The evaluation layer consists of stateless worker pools that pull submissions from a partitioned message queue, compile and execute code inside sandboxes under strict cgroup limits, and publish verdicts. The storage layer includes PostgreSQL for users, problems, contests, and submission records; Redis for live leaderboards and session state; S3 for archival of submission source code and test-case bundles; and Kafka as the durable message backbone connecting the API and evaluation layers.
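To make the contract between the API and evaluation layers concrete, one possible shape for the queue message is sketched below. The field names and the `new_submission` helper are illustrative assumptions, not a fixed schema; the key design point is that the message carries a pointer to the source in S3 rather than the source itself, keeping broker payloads small.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import uuid

@dataclass(frozen=True)
class SubmissionMessage:
    """Illustrative shape of the message published to the queue."""
    submission_id: str
    user_id: str
    contest_id: str
    problem_id: str   # doubles as the partition key
    language: str
    source_key: str   # object-store key for the submitted code
    submitted_at: str

def new_submission(user_id, contest_id, problem_id, language, source_key):
    """API-side constructor: assigns the UUID and timestamp at ingestion."""
    return SubmissionMessage(
        submission_id=str(uuid.uuid4()),
        user_id=user_id, contest_id=contest_id, problem_id=problem_id,
        language=language, source_key=source_key,
        submitted_at=datetime.now(timezone.utc).isoformat(),
    )
```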
Walk through a single submission end-to-end. A user submits code; the API server validates contest eligibility, assigns a UUID, persists the record as Pending, and publishes a message to Kafka partitioned by problem ID. A worker consumes the message, fetches the source and test-case bundle, spins up a Firecracker microVM with a language-specific toolchain, compiles with a 10-second timeout, then runs each test case sequentially with per-case CPU and memory limits. The worker aggregates results, writes a verdict row, updates the user's contest score atomically in Redis, and publishes a score-change event to a WebSocket fan-out service. If a worker crashes before committing its offset, Kafka's consumer-group rebalance reassigns its partitions and another worker reprocesses the uncommitted message, so verdict writes must be idempotent. Discuss how partitioning by problem prevents a single popular problem from monopolizing all workers.
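The aggregation step in that walkthrough can be sketched as a short-circuiting loop over test cases. The `run_case` contract here, returning a status and an output string, is an assumed interface for illustration, not a real judge API.

```python
def judge_submission(run_case, test_cases):
    """Fold per-case results into a single verdict.

    run_case(input) is assumed to return (status, output), where status
    is "ok", "timeout", or "crash". The first failing case decides the
    verdict, so workers can skip the remaining cases early.
    """
    for case in test_cases:
        status, output = run_case(case["input"])
        if status == "timeout":
            return "Time Limit Exceeded"
        if status == "crash":
            return "Runtime Error"
        if output != case["expected"]:
            return "Wrong Answer"
    return "Accepted"
```

Mention the trade-off behind early exit: it saves worker time, but contests with per-test-case partial credit must run every case regardless, which roughly multiplies evaluation cost.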
Cover monitoring: track submission queue depth, worker utilization, verdict latency percentiles, and sandbox failure rates. Explain database indexing on (contest_id, user_id) and (problem_id, status) for common query patterns. Discuss scaling strategy -- auto-scale the worker pool based on queue depth during contests, and drain gracefully by stopping new message consumption while finishing in-flight executions. Address data retention: archive old submissions to S3 with lifecycle policies, and keep recent contest data in PostgreSQL in tables partitioned by contest date. Finally, mention rate limiting on the submission endpoint, CDN caching for problem statements, and read replicas for the problem catalog to handle browsing traffic independently of contest load.
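The queue-depth autoscaling rule can be made concrete with a Little's-law-style sizing calculation: enough workers to drain the current backlog within the verdict-latency SLO. All parameter names and the min/max bounds are illustrative.

```python
import math

def target_workers(queue_depth, avg_verdict_seconds, latency_slo_seconds,
                   min_workers=2, max_workers=200):
    """Size the worker pool to drain the backlog within the SLO.

    If each verdict takes avg_verdict_seconds of worker time, then
    queue_depth * avg_verdict_seconds is the total work outstanding;
    dividing by the SLO gives the parallelism needed to clear it in time.
    Clamped so the pool never scales to zero or stampedes the sandbox hosts.
    """
    needed = math.ceil(queue_depth * avg_verdict_seconds / latency_slo_seconds)
    return max(min_workers, min(max_workers, needed))
```

In practice this target feeds an autoscaler with hysteresis (scale up fast, scale down slowly) so brief bursts at contest start don't cause thrashing.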