Design a Top-K URL System for LinkedIn Posts
Problem Statement
Design a system that tracks and displays the top K most shared URLs from social media posts across rolling time windows (5 minutes, 1 hour, 1 day). The system should handle high scale (10M+ articles shared per day), with rankings that are eventually consistent and converge within about one minute.
This is a real-time leaderboard that surfaces the most-shared links over rolling windows. Users see which articles are trending right now. Interviewers use this problem to test your ability to design streaming pipelines, sliding-window aggregations, and low-latency read paths at high scale, including handling skew (viral links), event-time vs processing-time correctness, late/out-of-order data, and tradeoffs between exactness, freshness, and cost.
Key Requirements
Functional
- Rolling leaderboards -- users view the top K most shared URLs for rolling windows of 5 minutes, 1 hour, and 1 day
- URL consolidation -- rankings consolidate the same article even when links differ by tracking parameters, redirects, or shorteners
- Near-real-time freshness -- users retrieve the leaderboard with approximately one-minute freshness and low read latency
- Scoped views -- users view global leaderboards and optionally scoped leaderboards (by region or language) with the same freshness guarantees
Non-Functional
- Scalability -- handle 10M+ unique URLs shared per day with viral URLs generating highly skewed traffic patterns
- Reliability -- maintain leaderboard availability during processing failures with graceful degradation to slightly stale results
- Latency -- leaderboard API responses under 100ms; processing pipeline maintains under one minute of lag
- Consistency -- eventual consistency acceptable for rankings with convergence within one minute of event occurrence
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Streaming Aggregation with Time Windows
Computing accurate rolling counts from a firehose of share events requires careful windowing and event-time processing.
Hints to consider:
- Use a stream processing framework (e.g., Apache Flink) with event-time processing, watermarks, and sliding windows for each time granularity
- Partition share events by normalized URL to enable parallel counting without global coordination
- Handle late and out-of-order events by defining allowed lateness windows that update prior counts before finalizing
- Implement hierarchical aggregation: aggregate per partition first, then merge partial top-K lists at a global level
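The hierarchical aggregation step can be illustrated with a minimal sketch. Assuming each partition has already produced its own `(url, count)` partial top-K list (with events partitioned by normalized URL, so each URL's counts live in one partition), the global merge is a sum followed by a heap selection:

```python
import heapq
from collections import Counter

def merge_partial_top_k(partials, k):
    """Merge per-partition (url, count) partial top-K lists into a global top-K.

    Each partition aggregates the URLs it owns; summing here is defensive in
    case a URL appears in more than one partial (e.g., after repartitioning).
    """
    totals = Counter()
    for partial in partials:
        for url, count in partial:
            totals[url] += count
    # heapq.nlargest selects the k highest-count entries without a full sort.
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])

partials = [
    [("a.com/x", 120), ("b.com/y", 80)],   # partition 1's local top-K
    [("c.com/z", 200), ("d.com/w", 15)],   # partition 2's local top-K
]
print(merge_partial_top_k(partials, 2))
# → [('c.com/z', 200), ('a.com/x', 120)]
```

Note that merging partial top-K lists is only exact when partitioning guarantees each URL is counted in a single partition; otherwise a URL just below the local cutoff in several partitions could be missed, which is one reason to partition by the normalized URL.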
2. URL Normalization and Deduplication
The same article can be shared with different tracking parameters, URL shorteners, or redirects, all of which must be consolidated.
Hints to consider:
- Strip common tracking parameters (utm_source, utm_medium, fbclid) during ingestion to normalize URLs
- Resolve URL shorteners and redirects asynchronously, caching resolved mappings with TTL
- Use a canonical URL registry that maps variants to a single canonical form for counting
- Handle edge cases where different URLs legitimately point to different content on the same domain
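A minimal normalization pass, covering only the parameter-stripping portion (shortener and redirect resolution would happen asynchronously, as noted above), might look like this; the tracking-parameter list is illustrative, not exhaustive:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Illustrative set of common tracking parameters to strip at ingestion.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term",
                   "utm_content", "fbclid", "gclid"}

def normalize_url(url):
    """Canonicalize a URL for counting: lowercase the scheme and host, drop
    the fragment, strip known tracking parameters, and sort the remaining
    query parameters so ordering differences don't create distinct keys."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in TRACKING_PARAMS]
    query.sort()
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path, urlencode(query), ""))

print(normalize_url("HTTPS://News.Example.com/story?utm_source=x&id=42#top"))
# → https://news.example.com/story?id=42
```

Query parameters that carry content identity (like `id=42` here) are deliberately preserved, which is exactly the edge case above: stripping too aggressively would merge genuinely different pages on the same domain.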
3. Hot Key Handling for Viral URLs
When a URL goes viral, millions of share events target the same counter, creating extreme write skew.
Hints to consider:
- Use sharded counters where increments for a viral URL are distributed across N sub-keys, merged at query time
- Implement approximate heavy-hitter detection (Count-Min Sketch or Space-Saving algorithm) for memory-efficient tracking
- Pre-aggregate counts at the partition level in stream processors before merging into global leaderboards
- Design the system to handle 10x traffic spikes on individual URLs without degrading overall pipeline throughput
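The sharded-counter idea can be sketched with an in-memory stand-in; in production the sub-keys would live in a distributed store (Redis, DynamoDB) where spreading one hot URL's writes across N keys avoids a single-shard bottleneck:

```python
import random

class ShardedCounter:
    """Spread increments for a hot key across N sub-keys so no single
    counter absorbs all writes; reads merge the sub-keys at query time.

    Backed by a plain dict here for illustration; each sub-key would map to
    an independent shard of a real key-value store.
    """
    def __init__(self, num_shards=16):
        self.num_shards = num_shards
        self.store = {}

    def increment(self, url, amount=1):
        # Pick a random sub-key so concurrent writers rarely collide.
        shard = random.randrange(self.num_shards)
        key = f"{url}#{shard}"
        self.store[key] = self.store.get(key, 0) + amount

    def total(self, url):
        # Reads pay the merge cost: sum across all sub-keys.
        return sum(self.store.get(f"{url}#{s}", 0)
                   for s in range(self.num_shards))

counter = ShardedCounter(num_shards=8)
for _ in range(1000):
    counter.increment("news.example.com/viral-story")
print(counter.total("news.example.com/viral-story"))
# → 1000
```

The tradeoff is the classic one: writes become embarrassingly parallel, while reads fan out over N sub-keys, which is acceptable when leaderboards are materialized periodically rather than computed per request.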
4. Efficient Leaderboard Serving
Serving low-latency leaderboard queries while the underlying counts are continuously updating requires materialization and caching.
Hints to consider:
- Materialize leaderboard snapshots into Redis sorted sets, atomically swapping old and new versions
- Pre-compute leaderboards for all supported window sizes and scopes, serving from cache rather than computing on read
- Version leaderboard snapshots so clients can detect staleness and the system can roll back to a previous version if processing fails
- Use consistent hashing to distribute leaderboard serving across multiple cache nodes for high availability
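The versioned-snapshot pattern can be sketched with an in-memory stand-in for the cache layer (in production this would be Redis sorted sets keyed by window and scope); the pipeline publishes whole snapshots, readers always see either the old or the new version, and a failed pipeline run simply leaves the last good version serving:

```python
import threading

class LeaderboardCache:
    """Serve pre-computed top-K snapshots per (window, scope).

    The pipeline publishes a complete new snapshot with a version number;
    reads atomically pick up the latest published version. If a pipeline
    run fails, the previous snapshot keeps serving -- graceful degradation
    to slightly stale results.
    """
    def __init__(self):
        self._lock = threading.Lock()
        self._snapshots = {}  # (window, scope) -> (version, entries)

    def publish(self, window, scope, version, entries):
        with self._lock:
            # Replacing the tuple wholesale is the "atomic swap": readers
            # never observe a half-written leaderboard.
            self._snapshots[(window, scope)] = (version, entries)

    def read(self, window, scope):
        return self._snapshots.get((window, scope), (None, []))

cache = LeaderboardCache()
cache.publish("5m", "global", 1, [("a.com/x", 120), ("c.com/z", 200)])
print(cache.read("5m", "global"))
# → (1, [('a.com/x', 120), ('c.com/z', 200)])
```

Returning the version alongside the entries is what lets clients detect staleness, and keeping the previous snapshot until a new one is fully built is what makes rollback trivial.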
Suggested Approach
Step 1: Clarify Requirements
Confirm scope with the interviewer. Ask about the value of K (top 10, 100, or 1000?), whether scoped leaderboards (by region, language) are required, how URL normalization should work (strip parameters vs resolve redirects), and acceptable staleness. Clarify whether the system needs to handle click-through events in addition to shares. Establish whether the leaderboard should show metadata (title, thumbnail) or just URLs and counts.