Build an online platform where developers participate in timed programming contests by solving algorithmic challenges. Users submit solutions in any of several supported programming languages; each submission must be compiled, executed against hidden test cases, and scored in real time. The system displays live rankings that update as submissions are evaluated and supports both individual practice and competitive events with enforced time windows.
The design must safely execute untrusted code from thousands of concurrent participants without compromising infrastructure, deliver verdicts within seconds, and maintain leaderboard accuracy when many users solve problems simultaneously. Key challenges include sandboxing arbitrary code, scaling the evaluation pipeline under bursty contest traffic, preventing double-scoring on retries, and ensuring fairness through consistent test environments and anti-cheating measures.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Executing arbitrary user code is the highest security and reliability risk. Interviewers want to see how you prevent resource exhaustion, sandbox escapes, and cascading failures from malicious or buggy submissions.
Hints to consider:
Compilation and test execution are too slow for synchronous request-response. Interviewers expect a robust async pipeline that handles retries, failures, and backpressure without losing submissions or double-scoring.
Hints to consider:
During active contests, leaderboards must reflect new scores within seconds while supporting thousands of concurrent readers. Recomputing rankings from the database on every request is too expensive.
Hints to consider:
Competitive integrity requires preventing late submissions, detecting plagiarism, and ensuring all users are judged by identical test cases. Interviewers look for how you enforce rules at scale.
Hints to consider:
Confirm how many concurrent users the system must support during peak contests and whether practice mode has different scaling requirements. Clarify which programming languages are required and whether custom libraries need support. Establish expectations around verdict latency and whether partial credit exists for individual test cases. Ask about contest formats, team competitions, and anti-cheating priorities.
Sketch a system with three tiers. The API layer handles user requests, serves problem content from cache, accepts submissions, and queries leaderboard state. The evaluation layer consists of worker pools that pull submissions from a message queue, compile code in sandboxes, execute test cases with resource limits, and publish verdicts. The storage layer includes PostgreSQL for users, problems, submissions, and contest metadata; Redis for live leaderboards and problem content caching; and Kafka for submission events and verdict notifications. Connect these tiers with the message queue for submission flow and a pub-sub channel for pushing real-time verdict notifications to clients via SSE or WebSocket.
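To make the submission flow concrete, a minimal sketch of the event the API layer might publish to the evaluation queue, with a stable partition key derived from the problem ID. All names here (the event fields, the partition count) are illustrative assumptions, not a fixed schema:

```python
import hashlib
import json
from dataclasses import dataclass, asdict

# Hypothetical submission event published to the evaluation queue.
# The code itself lives in blob storage; the event carries only a reference.
@dataclass
class SubmissionEvent:
    submission_id: str
    user_id: str
    contest_id: str
    problem_id: str
    language: str
    code_ref: str  # pointer to the stored source, not the source itself

NUM_PARTITIONS = 32  # illustrative partition count

def partition_for(problem_id: str) -> int:
    # Stable hash so every submission for a given problem lands on the
    # same partition, preserving per-problem ordering.
    digest = hashlib.sha256(problem_id.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

event = SubmissionEvent("sub-123", "u-42", "c-7", "two-sum", "python", "blob://sub-123")
payload = json.dumps(asdict(event))
partition = partition_for(event.problem_id)
```

Keeping the source code out of the message keeps queue payloads small and lets workers fetch code and test cases directly from storage.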
Walk through a submission lifecycle. The user submits code and the API server validates contest eligibility, assigns a unique submission ID, persists the record with Pending status, and publishes a message to the evaluation queue partitioned by problem ID. A worker claims the message, pulls the code and test cases, spins up an isolated container with CPU and memory cgroups, compiles the code with a timeout, and runs it against each test case sequentially. Each test is time-boxed and monitored for illegal system calls. The worker aggregates results (Accepted, Wrong Answer, Time Limit Exceeded), updates the submission record, calculates the user's new score based on contest rules, and atomically updates the Redis leaderboard. Discuss how partitioning the queue by problem ID distributes load fairly and prevents one popular problem from starving workers processing others.
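The per-test verdict logic above can be sketched as follows. This is a deliberately minimal model: a real judge would run inside an isolated container with cgroup CPU/memory limits and a syscall filter, whereas this sketch only shows the time-boxed execution and output comparison, and the function name is an assumption:

```python
import os
import subprocess
import sys
import tempfile

def judge_one(code: str, stdin_data: str, expected: str, time_limit_s: float = 2.0) -> str:
    """Run one test case for a Python submission and return a verdict string."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            input=stdin_data,
            capture_output=True,
            text=True,
            timeout=time_limit_s,  # wall-clock time box for this test case
        )
    except subprocess.TimeoutExpired:
        return "Time Limit Exceeded"
    finally:
        os.unlink(path)  # clean up the temp file in every case
    if proc.returncode != 0:
        return "Runtime Error"
    # Compare trimmed output, the usual judging convention.
    return "Accepted" if proc.stdout.strip() == expected.strip() else "Wrong Answer"
```

For example, `judge_one("print(int(input())*2)", "21", "42")` returns `"Accepted"`. The worker would call this once per test case, short-circuiting on the first failing verdict or aggregating all results, depending on whether the contest awards partial credit.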
Cover how the leaderboard serves reads: top-N queries use Redis ZREVRANGE, and "around me" queries use ZRANK plus a range fetch. Discuss caching strategies for problem statements and how to invalidate them when admins update content. Explain the database schema: normalize users, contests, problems, and submissions with indexes on contest_id and user_id for efficient queries. Address monitoring by tracking submission rates, queue depths, worker utilization, and verdict latency. Mention rate limiting on submissions per user to prevent abuse. Touch on dispute resolution by storing complete test outputs for admin review.
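The leaderboard queries above can be illustrated with a small in-memory model of the Redis sorted-set semantics. With real Redis these would be ZINCRBY, ZREVRANGE, and ZREVRANK calls through a client such as redis-py; the class and method names here are assumptions for illustration:

```python
class Leaderboard:
    """In-memory stand-in for a Redis sorted set keyed by user score."""

    def __init__(self):
        self.scores = {}  # user_id -> score

    def add_score(self, user_id: str, points: float) -> None:
        # Redis equivalent: ZINCRBY leaderboard points user_id
        self.scores[user_id] = self.scores.get(user_id, 0) + points

    def _ranked(self):
        # Highest score first; ties broken by user_id for determinism
        # (real Redis breaks score ties lexicographically by member).
        return sorted(self.scores.items(), key=lambda kv: (-kv[1], kv[0]))

    def top_n(self, n: int):
        # Redis equivalent: ZREVRANGE leaderboard 0 n-1 WITHSCORES
        return self._ranked()[:n]

    def around_me(self, user_id: str, window: int = 2):
        # Redis equivalent: ZREVRANK leaderboard user_id, then a
        # ZREVRANGE fetch of the ranks surrounding that position.
        ranked = self._ranked()
        rank = next(i for i, (u, _) in enumerate(ranked) if u == user_id)
        lo = max(0, rank - window)
        return ranked[lo:rank + window + 1]

lb = Leaderboard()
for user, pts in [("alice", 300), ("bob", 500), ("carol", 400)]:
    lb.add_score(user, pts)
```

Here `lb.top_n(2)` yields bob then carol, and `lb.around_me("carol", 1)` returns the window bob, carol, alice. Because both reads are served from the sorted structure rather than recomputed from the database, they stay cheap under thousands of concurrent readers.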