Design a web service that connects to a user's YouTube account, lists their uploaded videos, lets them select one or more videos and choose edits or visual effects (trimming, watermarking, color grading), processes the video, and reuploads the filtered result to YouTube. The application tracks which videos have already been processed and what filters were applied, so users can review their history and avoid duplicate work.
This problem tests your ability to architect systems that orchestrate long-running computational workloads, manage large binary assets without moving them through your application tier, coordinate multi-step pipelines with external API dependencies, and maintain consistency across distributed components. Interviewers want to see how you reason about asynchronous job processing, failure recovery, OAuth integration, YouTube API quota management, idempotency, and user experience during operations that may take minutes to complete.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Video processing is compute-intensive and unpredictable in duration. Interviewers want to see how you decouple submission from execution and scale workers independently from the API tier.
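The decoupling can be sketched as a submit path that persists a job record, publishes one message, and returns immediately. This is a minimal illustration, not a prescribed implementation: the in-process queue and dict stand in for a durable broker (SQS, RabbitMQ) and the metadata database, and all names are hypothetical.

```python
import json
import queue
import uuid

# Illustrative stand-ins for a durable message broker and a database.
job_queue: "queue.Queue[str]" = queue.Queue()
jobs: dict = {}  # job_id -> job record

def submit_job(user_id: str, video_id: str, filters: list) -> str:
    """API-tier submit path: persist a record, publish a message, return.

    No video bytes are touched here; workers consume from the queue at
    their own pace and scale independently of the API tier.
    """
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"user_id": user_id, "video_id": video_id,
                    "filters": filters, "status": "QUEUED"}
    job_queue.put(json.dumps({"job_id": job_id}))
    return job_id
```

Because submission only writes a row and one queue message, request latency stays flat no matter how long processing takes, and the worker fleet can be sized from queue depth alone.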
Moving multi-gigabyte files through your web servers or databases is expensive and slow. Interviewers expect workflows that minimize data movement and optimize storage costs.
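One way to keep multi-gigabyte files out of application memory is chunked streaming between endpoints. In this sketch the file-like `src` and `dst` are hypothetical stand-ins for an HTTP download stream and an S3 multipart upload; the chunk size is an assumption.

```python
import hashlib

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB, a typical multipart-upload part size

def stream_copy(src, dst, chunk_size: int = CHUNK_SIZE) -> str:
    """Copy src to dst in fixed-size chunks so no tier ever buffers the
    whole video, returning a SHA-256 checksum for integrity checks."""
    digest = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        dst.write(chunk)
    return digest.hexdigest()
```

With presigned URLs or server-side copies, even this worker-side streaming can sometimes be avoided entirely, so the bytes never transit your compute at all.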
The YouTube Data API enforces a strict daily quota, measured in units, plus per-second rate limits. Designs that ignore these constraints will be throttled or banned, leading to a poor user experience.
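A shared token bucket is one common way to stay under both the daily budget and per-second limits. The sketch below is in-process for clarity; the production assumption would be keeping the bucket state in Redis so every worker draws from one account-wide budget. Rates and costs are illustrative.

```python
import time

class TokenBucket:
    """Token-bucket limiter. In production, hold tokens/last in Redis so
    the budget is shared across all workers (YouTube quotas are global
    per project, not per machine)."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens replenished per second
        self.capacity = capacity  # burst ceiling
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; expensive API calls (e.g.
        uploads) can be given a higher cost than cheap reads."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Callers that fail to acquire should delay and requeue rather than spin, which also smooths bursts when many jobs complete at once.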
Each job spans multiple sequential stages: OAuth validation, metadata fetch, video download, filter application, encoding, quality check, upload, and completion. Interviewers probe how you coordinate these steps and handle partial failures.
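These stages can be collapsed into a small set of persisted job states with an explicit transition table, so a crashed or buggy worker can never record an impossible jump. The state names below are an illustrative coarsening (QUEUED reappears as the retry target):

```python
# Allowed transitions for the job state machine; anything else is a bug.
TRANSITIONS = {
    "QUEUED":      {"DOWNLOADING", "FAILED"},
    "DOWNLOADING": {"PROCESSING", "QUEUED", "FAILED"},  # QUEUED = requeued retry
    "PROCESSING":  {"UPLOADING", "QUEUED", "FAILED"},
    "UPLOADING":   {"COMPLETE", "QUEUED", "FAILED"},
    "COMPLETE":    set(),
    "FAILED":      set(),
}

def transition(job: dict, new_state: str) -> None:
    """Apply a state change only if the transition table permits it."""
    if new_state not in TRANSITIONS[job["status"]]:
        raise ValueError(f"illegal transition {job['status']} -> {new_state}")
    job["status"] = new_state
```

Persisting the state on every transition is what makes partial failure recoverable: a restarted worker reads the last durable state and knows exactly which stage to redo.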
Users should not accidentally reprocess the same video with identical filters. Interviewers assess your data modeling for enforcing uniqueness and providing a clear history view.
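Duplicate detection hinges on a canonical hash of (source video, filter configuration). One sketch, assuming filter order matters for the output while dict key order does not: serialize with sorted keys and fixed separators, then let a UNIQUE database index on, say, (user_id, dedup_key) reject repeats.

```python
import hashlib
import json

def dedup_key(source_video_id: str, filters: list) -> str:
    """Canonical SHA-256 over the video ID and the ordered filter configs.

    sort_keys and fixed separators normalize dict key order and
    whitespace, so logically identical requests hash identically.
    """
    canonical = json.dumps(
        {"video": source_video_id, "filters": filters},
        sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Storing this key on every job row doubles as the history view's join key: a lookup before enqueueing finds the prior completed job and returns its output instead of reprocessing.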
Confirm expected video sizes, typical filter complexity (simple color adjustments versus multi-pass compositing), and how many concurrent users the system should support. Ask whether real-time preview is needed before committing to a full processing job. Clarify which YouTube API operations are in scope (read library, download, upload) and whether the system must handle other platforms. Determine acceptable processing latency for different video lengths and whether users need email or push notifications on completion.
Sketch the major components:

- An API gateway for user requests and OAuth management.
- A job orchestrator that validates submissions and publishes to a message queue.
- A fleet of video processing workers consuming from the queue.
- Object storage holding downloaded sources and processed outputs.
- A metadata database (PostgreSQL) tracking users, OAuth tokens (encrypted), jobs, and filter history.
- A progress notification service using server-sent events or WebSockets for real-time updates.
- A rate-limiting layer backed by Redis that governs YouTube API calls.

Show the data flow: the user submits a job, the API creates a job record and publishes to the queue, a worker claims the job, downloads the video to S3, applies filters via FFmpeg, uploads the result to YouTube, updates the job status, and notifies the user.
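From the worker's side, this data flow reduces to claiming a message and threading the job through the stages. The stage functions are injected here so the sketch stays self-contained; in a real system they would wrap the YouTube Data API, S3, and FFmpeg (all names hypothetical).

```python
import json

def run_job(job_queue, jobs, download, apply_filters, upload):
    """One worker iteration: claim a message and advance the job through
    DOWNLOADING -> PROCESSING -> UPLOADING -> COMPLETE."""
    msg = json.loads(job_queue.get())
    job = jobs[msg["job_id"]]

    job["status"] = "DOWNLOADING"
    source_key = download(job["video_id"])                  # YouTube -> S3

    job["status"] = "PROCESSING"
    output_key = apply_filters(source_key, job["filters"])  # FFmpeg pass

    job["status"] = "UPLOADING"
    job["output_video_id"] = upload(output_key)             # S3 -> YouTube

    job["status"] = "COMPLETE"
```

In production each status write would be a durable database update (and the message acked only at the end), so a crash mid-stage leaves a resumable record rather than a lost job.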
Cover several cross-cutting concerns:

- OAuth token management: store refresh tokens encrypted at rest, rotate them on each refresh, and revoke access cleanly when users disconnect their account.
- Cost optimization: use spot instances for processing workers, delete intermediate S3 objects via lifecycle rules, and compress processed outputs before upload.
- Monitoring: track job throughput by stage, processing-duration percentiles, YouTube API error rates, quota consumption, and worker crash frequency.
- Security: validate all inputs passed to FFmpeg to prevent command injection, scan downloaded content for malicious payloads, and enforce per-user job quotas.
- Scaling: partition the job queue by user_id for fairness, auto-scale workers based on queue depth, and shard the metadata database if job volume outgrows a single instance.
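Autoscaling on queue depth can be as simple as sizing the fleet to drain the current backlog within a target window; the window, ceiling, and warm minimum below are illustrative assumptions, not tuned values.

```python
import math

def desired_workers(queue_depth: int, avg_job_secs: float,
                    target_drain_secs: float = 300.0,
                    max_workers: int = 50) -> int:
    """Workers needed to drain the backlog within target_drain_secs,
    clamped between a warm minimum of 1 and a fleet ceiling."""
    needed = math.ceil(queue_depth * avg_job_secs / target_drain_secs)
    return max(1, min(max_workers, needed))
```

Evaluating this on a timer (or a queue-depth alarm) and diffing against the current fleet size gives a scale-up/scale-down signal without tracking per-job state.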
Walk through a single job lifecycle. The orchestrator generates a job ID and computes a deduplication hash from the source video ID and filter configuration. It checks the history table: if a matching completed job exists, it returns the cached result immediately. Otherwise, it inserts a QUEUED job record and publishes a message. A worker picks up the message, transitions the job to DOWNLOADING, and streams the video from YouTube to S3. On completion, it transitions to PROCESSING, invokes FFmpeg with the selected filters, and writes the output to a new S3 key. It then transitions to UPLOADING and calls the YouTube upload API with an access token obtained from the user's stored refresh token, respecting the global rate limiter. On success, it records the output video ID, transitions to COMPLETE, and emits a notification event. On any failure, the worker increments a retry counter, logs the error, and requeues the message with exponential backoff. After the maximum number of retries, the job moves to FAILED and the user is notified with an error description.
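The failure path at the end of the lifecycle can be made concrete with capped exponential backoff plus jitter. MAX_RETRIES and the base/cap values below are illustrative assumptions.

```python
import random

MAX_RETRIES = 5

def backoff_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2^attempt)],
    so retried jobs don't hammer YouTube in lockstep."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))

def handle_failure(job: dict) -> tuple:
    """Increment the retry counter, then either requeue with a delay or
    mark the job permanently FAILED for user notification."""
    job["retries"] = job.get("retries", 0) + 1
    if job["retries"] > MAX_RETRIES:
        job["status"] = "FAILED"
        return "FAILED", 0.0
    job["status"] = "QUEUED"
    return "QUEUED", backoff_delay(job["retries"])
```

Pairing this with the deduplication hash keeps retries idempotent: a requeued job that already uploaded its output finds the completed record and stops instead of uploading twice.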