Design a web service that connects to a user's YouTube account, lists their uploaded videos, lets them select one or more videos and choose edits or visual effects (trimming, watermarking, color grading), processes the video, and reuploads the filtered result to YouTube. The application tracks which videos have already been processed and what filters were applied, so users can review their history and avoid duplicate work.
This problem tests your ability to architect systems that orchestrate long-running computational workloads, manage large binary assets without moving them through your application tier, coordinate multi-step pipelines with external API dependencies, and maintain consistency across distributed components. Interviewers want to see how you reason about asynchronous job processing, failure recovery, OAuth integration, YouTube API quota management, idempotency, and user experience during operations that may take minutes to complete.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Video processing is compute-intensive and unpredictable in duration. Interviewers want to see how you decouple submission from execution and scale workers independently from the API tier.
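The decoupling can be sketched as a submit path that persists a job record, publishes one message, and returns immediately. This is a minimal illustration, not a prescribed implementation: the in-process queue and dict stand in for a durable broker (SQS, RabbitMQ) and the metadata database, and all names are hypothetical.

```python
import json
import queue
import uuid

# Illustrative stand-ins for a durable message broker and a database.
job_queue: "queue.Queue[str]" = queue.Queue()
jobs: dict = {}  # job_id -> job record

def submit_job(user_id: str, video_id: str, filters: list) -> str:
    """API-tier submit path: persist a record, publish a message, return.

    No video bytes are touched here; workers consume from the queue at
    their own pace and scale independently of the API tier.
    """
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"user_id": user_id, "video_id": video_id,
                    "filters": filters, "status": "QUEUED"}
    job_queue.put(json.dumps({"job_id": job_id}))
    return job_id
```

Because submission only writes a row and one queue message, request latency stays flat no matter how long processing takes, and the worker fleet can be sized from queue depth alone.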
Moving multi-gigabyte files through your web servers or databases is expensive and slow. Interviewers expect workflows that minimize data movement and optimize storage costs.
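One way to keep multi-gigabyte files out of application memory is chunked streaming between endpoints. In this sketch the file-like `src` and `dst` are hypothetical stand-ins for an HTTP download stream and an S3 multipart upload; the chunk size is an assumption.

```python
import hashlib

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB, a typical multipart-upload part size

def stream_copy(src, dst, chunk_size: int = CHUNK_SIZE) -> str:
    """Copy src to dst in fixed-size chunks so no tier ever buffers the
    whole video, returning a SHA-256 checksum for integrity checks."""
    digest = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        dst.write(chunk)
    return digest.hexdigest()
```

With presigned URLs or server-side copies, even this worker-side streaming can sometimes be avoided entirely, so the bytes never transit your compute at all.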
The YouTube Data API enforces a strict daily quota, measured in units, plus per-second rate limits. Designs that ignore these constraints will be throttled or banned, leading to a poor user experience.
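A shared token bucket is one common way to stay under both the daily budget and per-second limits. The sketch below is in-process for clarity; the production assumption would be keeping the bucket state in Redis so every worker draws from one account-wide budget. Rates and costs are illustrative.

```python
import time

class TokenBucket:
    """Token-bucket limiter. In production, hold tokens/last in Redis so
    the budget is shared across all workers (YouTube quotas are global
    per project, not per machine)."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens replenished per second
        self.capacity = capacity  # burst ceiling
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; expensive API calls (e.g.
        uploads) can be given a higher cost than cheap reads."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Callers that fail to acquire should delay and requeue rather than spin, which also smooths bursts when many jobs complete at once.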
Each job spans multiple sequential stages: OAuth validation, metadata fetch, video download, filter application, encoding, quality check, upload, and completion. Interviewers probe how you coordinate these steps and handle partial failures.
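These stages can be collapsed into a small set of persisted job states with an explicit transition table, so a crashed or buggy worker can never record an impossible jump. The state names below are an illustrative coarsening (QUEUED reappears as the retry target):

```python
# Allowed transitions for the job state machine; anything else is a bug.
TRANSITIONS = {
    "QUEUED":      {"DOWNLOADING", "FAILED"},
    "DOWNLOADING": {"PROCESSING", "QUEUED", "FAILED"},  # QUEUED = requeued retry
    "PROCESSING":  {"UPLOADING", "QUEUED", "FAILED"},
    "UPLOADING":   {"COMPLETE", "QUEUED", "FAILED"},
    "COMPLETE":    set(),
    "FAILED":      set(),
}

def transition(job: dict, new_state: str) -> None:
    """Apply a state change only if the transition table permits it."""
    if new_state not in TRANSITIONS[job["status"]]:
        raise ValueError(f"illegal transition {job['status']} -> {new_state}")
    job["status"] = new_state
```

Persisting the state on every transition is what makes partial failure recoverable: a restarted worker reads the last durable state and knows exactly which stage to redo.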
Users should not accidentally reprocess the same video with identical filters. Interviewers assess your data modeling for enforcing uniqueness and providing a clear history view.
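Duplicate detection hinges on a canonical hash of (source video, filter configuration). One sketch, assuming filter order matters for the output while dict key order does not: serialize with sorted keys and fixed separators, then let a UNIQUE database index on, say, (user_id, dedup_key) reject repeats.

```python
import hashlib
import json

def dedup_key(source_video_id: str, filters: list) -> str:
    """Canonical SHA-256 over the video ID and the ordered filter configs.

    sort_keys and fixed separators normalize dict key order and
    whitespace, so logically identical requests hash identically.
    """
    canonical = json.dumps(
        {"video": source_video_id, "filters": filters},
        sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Storing this key on every job row doubles as the history view's join key: a lookup before enqueueing finds the prior completed job and returns its output instead of reprocessing.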
Confirm expected video sizes, typical filter complexity (simple color adjustments versus multi-pass compositing), and how many concurrent users the system should support. Ask whether real-time preview is needed before committing to a full processing job. Clarify which YouTube API operations are in scope (read library, download, upload) and whether the system must handle other platforms. Determine acceptable processing latency for different video lengths and whether users need email or push notifications on completion.
Sketch the major components:

- An API gateway for user requests and OAuth management.
- A job orchestrator that validates submissions and publishes to a message queue.
- A fleet of video processing workers consuming from the queue.
- Object storage holding downloaded sources and processed outputs.
- A metadata database (PostgreSQL) tracking users, OAuth tokens (encrypted), jobs, and filter history.
- A progress notification service using server-sent events or WebSockets for real-time updates.
- A rate-limiting layer backed by Redis that governs YouTube API calls.

Show the data flow: the user submits a job, the API creates a job record and publishes to the queue, a worker claims the job, downloads the video to S3, applies filters via FFmpeg, uploads the result to YouTube, updates the job status, and notifies the user.
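From the worker's side, this data flow reduces to claiming a message and threading the job through the stages. The stage functions are injected here so the sketch stays self-contained; in a real system they would wrap the YouTube Data API, S3, and FFmpeg (all names hypothetical).

```python
import json

def run_job(job_queue, jobs, download, apply_filters, upload):
    """One worker iteration: claim a message and advance the job through
    DOWNLOADING -> PROCESSING -> UPLOADING -> COMPLETE."""
    msg = json.loads(job_queue.get())
    job = jobs[msg["job_id"]]

    job["status"] = "DOWNLOADING"
    source_key = download(job["video_id"])                  # YouTube -> S3

    job["status"] = "PROCESSING"
    output_key = apply_filters(source_key, job["filters"])  # FFmpeg pass

    job["status"] = "UPLOADING"
    job["output_video_id"] = upload(output_key)             # S3 -> YouTube

    job["status"] = "COMPLETE"
```

In production each status write would be a durable database update (and the message acked only at the end), so a crash mid-stage leaves a resumable record rather than a lost job.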
Cover several cross-cutting concerns:

- OAuth token management: store refresh tokens encrypted at rest, rotate them on each refresh, and revoke access cleanly when users disconnect their account.
- Cost optimization: use spot instances for processing workers, delete intermediate S3 objects via lifecycle rules, and compress processed outputs before upload.
- Monitoring: track job throughput by stage, processing-duration percentiles, YouTube API error rates, quota consumption, and worker crash frequency.
- Security: validate all inputs passed to FFmpeg to prevent command injection, scan downloaded content for malicious payloads, and enforce per-user job quotas.
- Scaling: partition the job queue by user_id for fairness, auto-scale workers based on queue depth, and shard the metadata database if job volume outgrows a single instance.
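Autoscaling on queue depth can be as simple as sizing the fleet to drain the current backlog within a target window; the window, ceiling, and warm minimum below are illustrative assumptions, not tuned values.

```python
import math

def desired_workers(queue_depth: int, avg_job_secs: float,
                    target_drain_secs: float = 300.0,
                    max_workers: int = 50) -> int:
    """Workers needed to drain the backlog within target_drain_secs,
    clamped between a warm minimum of 1 and a fleet ceiling."""
    needed = math.ceil(queue_depth * avg_job_secs / target_drain_secs)
    return max(1, min(max_workers, needed))
```

Evaluating this on a timer (or a queue-depth alarm) and diffing against the current fleet size gives a scale-up/scale-down signal without tracking per-job state.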
Walk through a single job lifecycle. The orchestrator generates a job ID and computes a deduplication hash from the source video ID and filter configuration. It checks the history table: if a matching completed job exists, it returns the cached result immediately. Otherwise, it inserts a QUEUED job record and publishes a message. A worker picks up the message, transitions the job to DOWNLOADING, and streams the video from YouTube to S3. On completion, it transitions to PROCESSING, invokes FFmpeg with the selected filters, and writes the output to a new S3 key. It then transitions to UPLOADING and calls the YouTube upload API with an access token obtained from the user's stored refresh token, respecting the global rate limiter. On success, it records the output video ID, transitions to COMPLETE, and emits a notification event. On any failure, the worker increments a retry counter, logs the error, and requeues the message with exponential backoff. After the maximum number of retries, the job moves to FAILED and the user is notified with an error description.
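The failure path at the end of the lifecycle can be made concrete with capped exponential backoff plus jitter. MAX_RETRIES and the base/cap values below are illustrative assumptions.

```python
import random

MAX_RETRIES = 5

def backoff_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2^attempt)],
    so retried jobs don't hammer YouTube in lockstep."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))

def handle_failure(job: dict) -> tuple:
    """Increment the retry counter, then either requeue with a delay or
    mark the job permanently FAILED for user notification."""
    job["retries"] = job.get("retries", 0) + 1
    if job["retries"] > MAX_RETRIES:
        job["status"] = "FAILED"
        return "FAILED", 0.0
    job["status"] = "QUEUED"
    return "QUEUED", backoff_delay(job["retries"])
```

Pairing this with the deduplication hash keeps retries idempotent: a requeued job that already uploaded its output finds the completed record and stops instead of uploading twice.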