Design a URL Shortener
Problem Statement
Design a URL shortening service like Bitly or TinyURL. Users paste a long URL and get back a short link (e.g., sho.rt/Ab3Cd). Clicking the short link redirects to the original URL. The system should also track basic analytics like click counts.
This problem looks simple but tests core distributed systems skills: generating globally unique short codes without collisions, handling extremely read-heavy traffic (redirects vastly outnumber creates), caching for low-latency redirects, and capturing analytics without slowing down the redirect path.
Key Requirements
Functional
- Shorten -- users submit a long URL and receive a unique short URL
- Redirect -- visiting a short URL redirects the user to the original long URL with minimal latency
- Analytics -- users can view click counts, referrers, and geographic breakdown for their short links
- Management -- users can view, disable, or delete their short links via a dashboard
Non-Functional
- Scalability -- support billions of short links and thousands of redirects per second
- Latency -- redirects should complete in under 50ms at P95
- Availability -- the redirect path must be highly available; analytics can tolerate brief delays
- Durability -- once created, a short link must work reliably for its entire lifetime
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Short Code Generation Without Collisions
You need a strategy to generate short, unique codes that works across multiple servers without coordination overhead.
Hints to consider:
- Base62 encoding (a-z, A-Z, 0-9) gives 62^7 ≈ 3.5 trillion combinations for a 7-character code
- Consider pre-allocated ID ranges per server to avoid coordination, then Base62 encode the ID
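- Alternatively, hash the URL (MD5/SHA256), Base62 encode the digest, and take the first N characters, checking for collisions on insert and retrying with a salt when one occurs
- Discuss the trade-off: counter-based (no collisions, but sequential codes are predictable and easy to enumerate) vs. hash-based (no coordination between servers, but collisions are possible and must be detected)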
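The two strategies in the hints above can be sketched as follows. This is a minimal illustration, not a production design: `RangeAllocator`, `CodeGenerator`, and `hash_code` are hypothetical names, the allocator is an in-process stand-in for whatever coordination service would lease ID blocks, and the `taken` dict stands in for a conditional insert against the database.

```python
import hashlib
import string
import threading

# Alphabet order is an arbitrary choice for this sketch: digits, lowercase, uppercase.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


class RangeAllocator:
    """Stand-in for a central service that leases disjoint ID blocks."""

    def __init__(self, block_size: int = 1000):
        self._next_start = 0
        self._block_size = block_size
        self._lock = threading.Lock()

    def lease(self) -> tuple[int, int]:
        """Return a half-open range [start, end) that no other caller receives."""
        with self._lock:
            start = self._next_start
            self._next_start += self._block_size
        return start, start + self._block_size


class CodeGenerator:
    """Per-server generator: contacts the allocator only when its block runs out."""

    def __init__(self, allocator: RangeAllocator):
        self._allocator = allocator
        self._current, self._end = allocator.lease()
        self._lock = threading.Lock()

    def next_code(self) -> str:
        with self._lock:
            if self._current >= self._end:
                self._current, self._end = self._allocator.lease()
            value = self._current
            self._current += 1
        return base62_encode(value)


def hash_code(long_url: str, taken: dict, length: int = 7) -> str:
    """Hash-based alternative: on collision, retry with a different salt."""
    attempt = 0
    while True:
        digest = hashlib.sha256(f"{long_url}#{attempt}".encode()).digest()
        encoded = base62_encode(int.from_bytes(digest[:8], "big"))
        code = encoded[:length].rjust(length, ALPHABET[0])
        existing = taken.get(code)
        if existing is None or existing == long_url:
            taken[code] = long_url  # assumed to be an atomic insert in a real store
            return code
        attempt += 1
```

With this layout, two servers sharing one allocator never hand out the same code, and each server talks to the allocator only once per block. The counter-based codes are short for small IDs; a real service would likely start the counter at a large offset or pad the output if fixed-length codes are wanted.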
2. Optimizing the Redirect Path
Redirects are the hot path: they happen orders of magnitude more often than creates, so every millisecond of latency matters.
Hints to consider:
- Put a caching layer (Redis or CDN edge) in front of the database to serve most redirects from memory
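- Consider 301 (permanent) vs. 302 (temporary) redirects: a 301 lets browsers cache the redirect, so repeat clicks may never reach your servers and go uncounted, while a 302 is not cached by default and keeps every click visible at the cost of more traffic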
- Deploy cache nodes in multiple regions so redirects are served close to the user
- Discuss cache invalidation: what happens when a user disables or updates a short link?
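One way to picture the caching and invalidation hints above is a cache-aside lookup with explicit deletion when a link is disabled. The sketch below is illustrative only: `TTLCache` is an in-memory stand-in for Redis, the `db` dict stands in for the database, and `RedirectService`, the 302 status, and the 404 for disabled links are assumptions made for the example.

```python
import time


class TTLCache:
    """In-memory stand-in for Redis: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl)

    def delete(self, key):
        self._data.pop(key, None)


class RedirectService:
    """Cache-aside redirect lookup with invalidation on disable."""

    def __init__(self, db: dict, cache: TTLCache):
        self._db = db  # code -> {"url": str, "active": bool}
        self._cache = cache
        self.db_reads = 0  # exposed only to show how often the cache misses

    def resolve(self, code: str):
        """Return (status, location) for a short code."""
        url = self._cache.get(code)
        if url is None:
            self.db_reads += 1
            row = self._db.get(code)
            if row is None or not row["active"]:
                return 404, None
            url = row["url"]
            self._cache.set(code, url)
        return 302, url

    def disable(self, code: str):
        """Update the source of truth first, then drop the cached entry."""
        row = self._db.get(code)
        if row is not None:
            row["active"] = False
        self._cache.delete(code)
```

The TTL bounds how long a stale entry can survive if an invalidation is missed, for example across regions, which is the trade-off worth discussing: a shorter TTL means fresher data but more database reads.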
3. Analytics Without Slowing Redirects
You want to count clicks and capture metadata (referrer, country, timestamp) without adding latency to the redirect response.
Hints to consider:
- Log click events to a message queue (Kafka) asynchronously after sending the redirect response
- Batch analytics events and aggregate them in a background pipeline rather than updating counters synchronously
- Accept eventual consistency for analytics: the click count can lag by a few seconds
- Consider sampling for extremely high-traffic links to reduce write volume
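The hints above can be sketched with an in-process queue and a background worker. This is a simplified illustration under stated assumptions: `queue.Queue` stands in for Kafka, `ClickPipeline` and its parameters are hypothetical names, and the sampled counts are scaled by `1 / sample_rate` so they are estimates rather than exact totals.

```python
import queue
import random
import threading
import time
from collections import Counter


class ClickPipeline:
    """Enqueue click events on the hot path; aggregate them in the background."""

    def __init__(self, batch_size: int = 100, sample_rate: float = 1.0):
        if not 0.0 < sample_rate <= 1.0:
            raise ValueError("sample_rate must be in (0, 1]")
        self._events = queue.Queue()
        self._batch_size = batch_size
        self._sample_rate = sample_rate
        self.counts = Counter()  # code -> (estimated) click count
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, code: str, referrer=None, country=None):
        """Called by the redirect handler: enqueue and return immediately."""
        if self._sample_rate < 1.0 and random.random() >= self._sample_rate:
            return  # event dropped by sampling
        self._events.put(
            {"code": code, "referrer": referrer, "country": country, "ts": time.time()}
        )

    def _run(self):
        while True:
            batch = [self._events.get()]  # block until at least one event arrives
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._events.get_nowait())
                except queue.Empty:
                    break
            self._flush(batch)
            for _ in batch:
                self._events.task_done()

    def _flush(self, batch):
        """One aggregated update per code per batch instead of one write per click."""
        per_code = Counter(event["code"] for event in batch)
        for code, n in per_code.items():
            self.counts[code] += n / self._sample_rate

    def drain(self):
        """Wait until every queued event has been aggregated (for tests and demos)."""
        self._events.join()
```

The redirect handler only pays for a queue append, and the counts become visible once the worker flushes a batch, which is the eventual consistency the hints describe.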