Design a URL shortening service similar to TinyURL or Bitly that allows users to convert long URLs into short, shareable links and manage their shortened URLs. A user pastes a long URL, gets a compact code like https://sho.rt/Ab3Cd, and anyone hitting that short link is redirected to the original address.
The system must support creating short links, redirecting users with low latency, managing links through an authenticated dashboard (view, disable, delete), and providing basic analytics (total clicks, time series, referrers, country/region). This looks deceptively simple but exercises core distributed systems skills: globally unique ID generation without collisions, extreme read scaling for redirects, low-latency edge serving, write-heavy analytics capture, abuse prevention, and solid data modeling.
At ServiceTitan scale, interviewers want to see crisp requirement definition, realistic scale estimation, the right storage and caching strategy, and pragmatic trade-offs around availability, consistency, and cost.
Based on real interview experiences, these are the areas interviewers probe most deeply:
The core ID generation strategy determines much of the system's scalability and correctness. Naive approaches like a single auto-increment counter create contention and single points of failure.
Hints to consider:
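As a concrete starting point, one common scheme is base62 encoding over a partitioned counter: each server owns a machine ID and increments its own counter, so codes are globally unique without coordination on the hot path. The sketch below is illustrative (the `machine_bits` split is an assumption), not a prescribed implementation:

```python
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def encode_base62(n: int) -> str:
    # Convert a numeric ID into a short code. Base62 avoids the '+' and '/'
    # characters of base64, keeping codes URL-safe without percent-encoding.
    if n == 0:
        return BASE62[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(BASE62[r])
    return "".join(reversed(out))

def code_from_range(counter: int, machine_id: int, machine_bits: int = 10) -> str:
    # Hypothetical sharded-counter scheme: interleaving a per-server machine_id
    # into the low bits makes (counter, machine_id) pairs globally unique.
    return encode_base62((counter << machine_bits) | machine_id)
```

A 6-character base62 code covers 62^6 ≈ 56 billion links, which is why most designs settle on 6–8 characters.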
Redirect traffic is overwhelmingly read-heavy. Missing latency targets or hammering the primary datastore on every redirect is a critical red flag.
Hints to consider:
Coupling analytics writes to the redirect path raises tail latency and creates failure correlation. A backlog or partial outage in analytics should never break redirects.
Hints to consider:
URL shorteners are frequent targets for spam, phishing, and denial-of-service. Interviewers want to see you think about operational safety.
Hints to consider:
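A token bucket per API key or IP is the usual first line of defense against bulk spam creation. A minimal in-process sketch follows; the rate and capacity are placeholders, and production systems typically keep the buckets in Redis so limits hold across app servers:

```python
import time

class TokenBucket:
    # Per-client rate limiter for the link-creation endpoint.
    def __init__(self, rate: float, capacity: int):
        self.rate = rate                 # tokens refilled per second
        self.capacity = capacity         # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Rate limiting handles volume abuse; phishing and malware need separate controls, such as checking destination URLs against a blocklist at creation time and re-scanning popular links periodically.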
The core mapping from short code to URL is a classic key-value problem, but the full data model includes user ownership, metadata, and analytics.
Hints to consider:
Confirm the expected scale: how many link creations per day, how many redirects per day, and what read-to-write ratio to design for. Ask whether custom aliases are supported (users choosing their own short code). Clarify the analytics depth: just click counts, or full breakdowns by time, geography, and referrer. Confirm whether links can expire and whether the system needs to support bulk creation (API-driven). Ask about geographic distribution of users to inform CDN and multi-region decisions.
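Once the interviewer supplies numbers, a quick back-of-envelope pass anchors the rest of the design. The inputs below are hypothetical targets (10M creates and 1B redirects per day, a 100:1 read-to-write ratio); swap in whatever figures you are given:

```python
# Back-of-envelope capacity estimation with assumed inputs.
links_per_day = 10_000_000
redirects_per_day = 1_000_000_000
seconds_per_day = 86_400

write_qps = links_per_day / seconds_per_day        # ~116 creates/sec average
read_qps = redirects_per_day / seconds_per_day     # ~11,600 redirects/sec average
bytes_per_record = 500                             # code + URL + metadata, assumed

storage_per_year_gb = links_per_day * 365 * bytes_per_record / 1e9  # ~1.8 TB/year
```

Remember to multiply averages by a peak factor (often 2–5x) when sizing caches and provisioned throughput.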
Sketch the core components: an API Gateway for authentication and rate limiting, a Link Service handling creation and management, a Redirect Service optimized for speed, a Cache Layer (Redis plus CDN), a primary datastore (DynamoDB) for the code-to-URL mapping and metadata, a message queue (Kafka) for analytics event ingestion, and an Analytics Service backed by a time-series store. Show the write path (create link, generate code, persist, return URL) and the read path (CDN check, Redis check, database lookup, redirect, emit analytics event). Highlight that the redirect path is the hot path and must be as thin as possible.
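The read path in miniature can be reduced to a tiered lookup ending in an HTTP redirect. The dicts below stand in for the CDN, Redis, and database tiers; the response is a plain dict rather than a real framework object. One design choice worth voicing: a 301 is cached permanently by browsers (fastest, but repeat clicks never reach the origin), while a 302 keeps every click visible for analytics.

```python
# Tiered lookup for the redirect hot path; each tier is a dict stand-in.
def redirect_response(code, cdn, redis_cache, db):
    for tier in (cdn, redis_cache, db):
        url = tier.get(code)
        if url:
            # 301 = permanent (browser-cached); use 302 if per-click
            # analytics at the origin matter more than raw speed.
            return {"status": 301, "Location": url}
    return {"status": 404}
```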
Walk through the full lifecycle of a redirect. A client hits the CDN with a short URL. If the edge has a cached redirect, it responds immediately (sub-10ms). On a cache miss, the request reaches the Redirect Service, which checks Redis, then DynamoDB if needed. The redirect is served and the mapping is cached at both layers. Simultaneously, a click event is published to Kafka. Discuss code generation in detail: show how a distributed counter or hash-based scheme avoids hotspots and single points of failure. Explain how conditional writes in DynamoDB guarantee uniqueness without distributed locking.
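The conditional-write guarantee can be sketched without a real database. DynamoDB rejects a `put_item` carrying `ConditionExpression="attribute_not_exists(code)"` if the code already exists; the dict-backed stub below mimics that semantics so the retry logic is testable:

```python
# Uniqueness via conditional writes, with a stub mimicking DynamoDB's
# attribute_not_exists condition. With boto3 this would be
# table.put_item(Item=..., ConditionExpression="attribute_not_exists(code)").
class ConditionalStore:
    def __init__(self):
        self.items: dict[str, str] = {}

    def put_if_absent(self, code: str, url: str) -> bool:
        if code in self.items:     # mirrors a ConditionalCheckFailedException
            return False
        self.items[code] = url
        return True

def create_link(store: ConditionalStore, url: str, candidates) -> str:
    # Try candidate codes until a conditional write succeeds.
    # No distributed lock is needed: the datastore arbitrates races.
    for code in candidates:
        if store.put_if_absent(code, url):
            return code
    raise RuntimeError("exhausted candidate codes")
```

With counter-based generation collisions should never occur, so this loop matters mainly for hash-based schemes and user-chosen custom aliases.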
Cover the analytics pipeline: Kafka consumers aggregate click events into per-link counters and time-series buckets, stored in a columnar database or time-series store. Discuss cache invalidation when a user disables or deletes a link: invalidate CDN and Redis entries, and serve a 404 or gone page. Address link expiration with DynamoDB TTL triggering cleanup of cache entries. Cover monitoring: track redirect latency percentiles, cache hit rates, creation throughput, and Kafka consumer lag. Discuss cost optimization: most redirects should be served from CDN, minimizing origin hits and database reads.
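The consumer-side aggregation step can be sketched as rolling raw click events into per-link, per-minute buckets before they hit the time-series store. The event shape is illustrative:

```python
from collections import defaultdict

# Aggregate raw click events into (code, minute) buckets. A real Kafka
# consumer would do this over a time window, then flush counts downstream.
def aggregate(events):
    buckets: dict[tuple[str, str], int] = defaultdict(int)
    for e in events:
        minute = e["ts"].strftime("%Y-%m-%dT%H:%M")
        buckets[(e["code"], minute)] += 1
    return dict(buckets)
```

Pre-aggregating like this turns millions of raw events into a handful of counter updates, which is what makes the analytics store affordable at redirect scale.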
Deepen your understanding of the patterns used in this problem: