Design a URL Shortener
Problem Statement
Design a URL shortening service similar to TinyURL or Bitly that allows users to convert long URLs into short, shareable links, tracks click analytics, and manages link lifecycle. Users paste a long URL, receive a compact code like https://sho.rt/Ab3Cd, and anyone visiting that short link is redirected to the original address.
Interviewers ask this because it looks simple but exercises core distributed systems skills: globally unique ID generation without collisions, extreme read scaling (redirects dominate), low latency at the edge, write-heavy analytics capture, abuse prevention, and solid data modeling. It is a perfect canvas to see if you can define crisp requirements, estimate scale, choose the right storage and caching strategy, and make pragmatic tradeoffs around availability, consistency, and cost.
Key Requirements
Functional
- Link creation -- users submit a long URL and receive a unique short code; optionally specify a custom alias
- Redirect -- visiting a short URL redirects the user to the original long URL with low latency
- Link management -- authenticated users can view, disable, delete, or update their links via a dashboard
- Click analytics -- users can see basic analytics for each link including total clicks, time series, referrers, and geography
Non-Functional
- Scalability -- handle billions of stored links, 100K+ redirects per second, and thousands of link creations per second
- Reliability -- maintain 99.99% availability for redirects; analytics ingestion tolerates brief delays but no data loss
- Latency -- serve redirects in under 10ms from cache, under 50ms from database; link creation under 200ms
- Consistency -- strong consistency for link creation (no duplicate codes); eventual consistency acceptable for analytics
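A quick back-of-envelope calculation helps turn these targets into sizing numbers. The sketch below is illustrative only: the total link count, the per-record size, and the exact creation rate are assumptions chosen to fit "billions of links" and "thousands of creations per second", not figures from a real system.

```python
# All inputs are illustrative assumptions, not measurements.
TOTAL_LINKS = 5_000_000_000      # assumed reading of "billions of stored links"
BYTES_PER_LINK = 500             # assumed: code, long URL, owner, timestamps, flags
REDIRECTS_PER_SEC = 100_000      # from the scalability target
CREATIONS_PER_SEC = 1_000        # assumed low end of "thousands per second"
SECONDS_PER_DAY = 86_400

# Raw size of the mapping table, before replication and indexes.
storage_tb = TOTAL_LINKS * BYTES_PER_LINK / 1e12

# Daily redirect volume, which is also the daily click-event volume.
redirects_per_day = REDIRECTS_PER_SEC * SECONDS_PER_DAY

# How strongly reads dominate writes.
read_write_ratio = REDIRECTS_PER_SEC / CREATIONS_PER_SEC

print(f"link storage: {storage_tb:.1f} TB")
print(f"redirects per day: {redirects_per_day:,}")
print(f"read/write ratio: {read_write_ratio:.0f}:1")
```

Under these assumptions the mapping table is roughly 2.5 TB before replication, while analytics must absorb about 8.6 billion click events per day, which suggests the analytics pipeline, not link storage, is the larger data problem.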
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Short Code Generation and Uniqueness
Generating short, unique codes at scale without collisions or hotspots is the foundational design decision. Interviewers test whether you understand the tradeoffs between different ID generation strategies.
Hints to consider:
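- Use base-62 encoding of a numeric ID to produce compact codes: 6 characters cover about 56.8 billion IDs and 7 cover about 3.5 trillion, while a full 64-bit ID (for example, Snowflake-style) can need up to 11 characters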
- Generate IDs using a distributed ID service (Snowflake-style), pre-allocated ranges per server, or random generation with collision checking
- Avoid single auto-increment counters that create contention and single points of failure
- Support custom aliases by checking uniqueness at creation time with a conditional write
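The hints above can be sketched as base-62 encoding combined with pre-allocated ID ranges. This is a minimal illustration, not a definitive implementation: `RangeAllocator` and `CodeGenerator` are hypothetical names, and the in-process allocator stands in for whatever coordination service or database would lease ID blocks in a real deployment.

```python
import string
import threading

# 0-9, A-Z, a-z: 62 symbols in total.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


class RangeAllocator:
    """Hypothetical stand-in for a shared service that leases ID blocks."""

    def __init__(self, block_size: int = 1000):
        self._next_start = 0
        self._block_size = block_size
        self._lock = threading.Lock()

    def lease(self) -> range:
        # Only this short critical section is shared between servers.
        with self._lock:
            start = self._next_start
            self._next_start += self._block_size
        return range(start, start + self._block_size)


class CodeGenerator:
    """Per-server generator that consumes its leased block locally."""

    def __init__(self, allocator: RangeAllocator):
        self._allocator = allocator
        self._ids = iter(())

    def next_code(self) -> str:
        try:
            next_id = next(self._ids)
        except StopIteration:
            # Block exhausted: lease a fresh one, then continue.
            self._ids = iter(self._allocator.lease())
            next_id = next(self._ids)
        return base62_encode(next_id)
```

Because each server touches the shared allocator only once per block, contention stays low. One tradeoff worth raising in an interview: sequential IDs produce guessable codes, so a design that needs unguessable links would have to scramble the ID or use random generation with a collision check instead.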
2. Read-Heavy Redirect Scaling
Redirect traffic is overwhelmingly read-heavy, often 100-1000x the write rate. Meeting global low-latency SLAs requires aggressive caching and edge serving.
Hints to consider:
- Cache short-code-to-URL mappings in Redis with long TTLs (links rarely change) for sub-millisecond lookups, and invalidate the entry when a link is updated, disabled, or deleted
- Deploy Redis or cache layers in multiple regions to minimize redirect latency globally
- Use CDN or edge compute for static redirects of popular links to reduce load on origin servers
- Implement negative caching for invalid short codes to prevent repeated database lookups
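The cache-aside lookup with negative caching described above might look like the following sketch. It uses an in-memory `TTLCache` and a plain dict as stand-ins for Redis and the database; the class names and the TTL values are assumptions made for illustration.

```python
import time
from typing import Optional

_MISSING = object()  # sentinel: "nothing cached", distinct from a cached None


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (stand-in for Redis)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return _MISSING
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]
            return _MISSING
        return value

    def set(self, key, value, ttl: float) -> None:
        self._data[key] = (value, self._clock() + ttl)


class Resolver:
    POSITIVE_TTL = 24 * 3600  # assumed: valid links rarely change
    NEGATIVE_TTL = 60         # assumed: short, so new codes appear quickly

    def __init__(self, cache: TTLCache, db: dict):
        self.cache = cache
        self.db = db          # dict standing in for the link database
        self.db_reads = 0     # counter to make cache behavior observable

    def resolve(self, code: str) -> Optional[str]:
        cached = self.cache.get(code)
        if cached is not _MISSING:
            return cached     # may be None, meaning a cached "not found"
        self.db_reads += 1
        url = self.db.get(code)
        ttl = self.POSITIVE_TTL if url is not None else self.NEGATIVE_TTL
        self.cache.set(code, url, ttl)
        return url
```

Keeping the negative TTL much shorter than the positive one limits how long a newly created code could be masked by a stale "not found" entry, while still absorbing repeated lookups for invalid codes.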
3. Analytics Capture Without Impacting Redirect Latency
Logging click data on the redirect path must not slow down the user experience. Interviewers want to see how you decouple analytics from the critical path.
Hints to consider:
- Capture click events asynchronously by publishing to Kafka from the redirect handler, never blocking the response
- Include metadata (timestamp, referrer, user agent, IP-based geo) in the click event for downstream processing
- Use stream processing to aggregate click data into time-bucketed summaries for dashboard queries
- Store raw events in a data warehouse for ad-hoc analysis and pre-aggregated summaries in a fast store for dashboards
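To illustrate the decoupling, here is a minimal sketch in which the redirect handler only enqueues a click event and a background consumer aggregates events into hourly buckets. The in-process `queue.Queue` is a stand-in for a Kafka topic, the function names are hypothetical, and the choice of a 302 status is an assumption made so that every click reaches the server.

```python
import queue
import threading
from collections import Counter
from datetime import datetime, timezone

events = queue.Queue()     # stand-in for a durable log such as a Kafka topic
hourly_clicks = Counter()  # stand-in for the pre-aggregated dashboard store


def handle_redirect(code, long_url, referrer, now=None):
    """Enqueue the click event and return the redirect immediately."""
    now = now or datetime.now(timezone.utc)
    events.put_nowait({"code": code, "ts": now, "referrer": referrer})
    return 302, {"Location": long_url}


def hour_bucket(ts: datetime) -> str:
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0).isoformat()


def aggregate_forever():
    """Consumer loop: roll events up into (code, hour) counts."""
    while True:
        event = events.get()
        if event is None:  # sentinel used to stop the worker
            events.task_done()
            break
        hourly_clicks[(event["code"], hour_bucket(event["ts"]))] += 1
        events.task_done()
```

The handler does no aggregation and no storage writes, so its latency is independent of the analytics pipeline. An in-memory queue loses events on a crash, which is why the "no data loss" requirement points toward a durable, replicated log in a real design.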
4. Abuse Prevention and Link Management
URL shorteners are targets for spam, phishing, and link manipulation. Interviewers want to see defensive design patterns.
Hints to consider:
- Rate-limit link creation per IP and per authenticated user to prevent mass link generation
- Scan destination URLs against blocklists and safe-browsing APIs before creating short links
- Support link expiration with TTLs and explicit disabling by owners
- Implement abuse reporting that can flag and disable malicious links quickly
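Per-IP and per-user rate limiting can be sketched with a token bucket, one common choice among several (fixed or sliding windows would also work). The class name and parameters below are illustrative assumptions, and the in-memory dict stands in for a shared store that all API servers would consult.

```python
import time


class TokenBucketLimiter:
    """Per-key token bucket: bursts up to `capacity`, then a steady refill."""

    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        self._buckets = {}  # key -> (tokens, last_seen_time)

    def allow(self, key: str) -> bool:
        now = self._clock()
        # A key seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(key, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1:
            self._buckets[key] = (tokens - 1, now)
            return True
        self._buckets[key] = (tokens, now)
        return False
```

A creation endpoint could call `allow` twice, once keyed by client IP and once by user ID, and reject the request if either returns False.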
Suggested Approach
Step 1: Clarify Requirements
Start by confirming the scale and scope. Ask about expected link volume (how many total links, creation rate, redirect rate), whether custom aliases are required, link expiration policies, and analytics depth. Clarify geographic distribution: is this a single-region service or global? Determine whether authenticated user accounts are required or anonymous creation is supported. Establish latency targets for redirects versus creation versus analytics queries.