Design a Top-K URL System for LinkedIn Posts
Problem Statement
Design a system that tracks and displays the top K most shared URLs from social media posts across rolling time windows (5 minutes, 1 hour, 1 day). The system should handle high scale (10M+ articles shared per day), with rankings that are eventually consistent and converge within about one minute.
This is a real-time leaderboard that surfaces the most-shared links over rolling windows. Users see which articles are trending right now. Interviewers use this problem to test your ability to design streaming pipelines, sliding-window aggregations, and low-latency read paths at high scale, including handling skew (viral links), event-time vs processing-time correctness, late/out-of-order data, and tradeoffs between exactness, freshness, and cost.
Key Requirements
Functional
- Rolling leaderboards -- users view the top K most shared URLs for rolling windows of 5 minutes, 1 hour, and 1 day
- URL consolidation -- rankings consolidate the same article even when links differ by tracking parameters, redirects, or shorteners
- Near-real-time freshness -- users retrieve the leaderboard with approximately one-minute freshness and low read latency
- Scoped views -- users view global leaderboards and optionally scoped leaderboards (by region or language) with the same freshness guarantees
Non-Functional
- Scalability -- handle 10M+ unique URLs shared per day with viral URLs generating highly skewed traffic patterns
- Reliability -- maintain leaderboard availability during processing failures with graceful degradation to slightly stale results
- Latency -- leaderboard API responses under 100ms; processing pipeline maintains under one minute of lag
- Consistency -- eventual consistency acceptable for rankings with convergence within one minute of event occurrence
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Streaming Aggregation with Time Windows
Computing accurate rolling counts from a firehose of share events requires careful windowing and event-time processing.
Hints to consider:
- Use a stream processing framework (e.g., Apache Flink) with event-time processing, watermarks, and sliding windows for each time granularity
- Partition share events by normalized URL to enable parallel counting without global coordination
- Handle late and out-of-order events by defining allowed lateness windows that update prior counts before finalizing
- Implement hierarchical aggregation: aggregate per partition first, then merge partial top-K lists at a global level
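The hierarchical aggregation step can be illustrated with a minimal sketch. Assuming each partition has already produced its own `(url, count)` partial top-K list (with events partitioned by normalized URL, so each URL's counts live in one partition), the global merge is a sum followed by a heap selection:

```python
import heapq
from collections import Counter

def merge_partial_top_k(partials, k):
    """Merge per-partition (url, count) partial top-K lists into a global top-K.

    Each partition aggregates the URLs it owns; summing here is defensive in
    case a URL appears in more than one partial (e.g., after repartitioning).
    """
    totals = Counter()
    for partial in partials:
        for url, count in partial:
            totals[url] += count
    # heapq.nlargest selects the k highest-count entries without a full sort.
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])

partials = [
    [("a.com/x", 120), ("b.com/y", 80)],   # partition 1's local top-K
    [("c.com/z", 200), ("d.com/w", 15)],   # partition 2's local top-K
]
print(merge_partial_top_k(partials, 2))
# → [('c.com/z', 200), ('a.com/x', 120)]
```

Note that merging partial top-K lists is only exact when partitioning guarantees each URL is counted in a single partition; otherwise a URL just below the local cutoff in several partitions could be missed, which is one reason to partition by the normalized URL.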
2. URL Normalization and Deduplication
The same article can be shared with different tracking parameters, URL shorteners, or redirects, all of which must be consolidated.
Hints to consider:
- Strip common tracking parameters (utm_source, utm_medium, fbclid) during ingestion to normalize URLs
- Resolve URL shorteners and redirects asynchronously, caching resolved mappings with TTL
- Use a canonical URL registry that maps variants to a single canonical form for counting
- Handle edge cases where different URLs legitimately point to different content on the same domain
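A minimal normalization pass, covering only the parameter-stripping portion (shortener and redirect resolution would happen asynchronously, as noted above), might look like this; the tracking-parameter list is illustrative, not exhaustive:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Illustrative set of common tracking parameters to strip at ingestion.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term",
                   "utm_content", "fbclid", "gclid"}

def normalize_url(url):
    """Canonicalize a URL for counting: lowercase the scheme and host, drop
    the fragment, strip known tracking parameters, and sort the remaining
    query parameters so ordering differences don't create distinct keys."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in TRACKING_PARAMS]
    query.sort()
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path, urlencode(query), ""))

print(normalize_url("HTTPS://News.Example.com/story?utm_source=x&id=42#top"))
# → https://news.example.com/story?id=42
```

Query parameters that carry content identity (like `id=42` here) are deliberately preserved, which is exactly the edge case above: stripping too aggressively would merge genuinely different pages on the same domain.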
3. Hot Key Handling for Viral URLs
When a URL goes viral, millions of share events target the same counter, creating extreme write skew.
Hints to consider:
- Use sharded counters where increments for a viral URL are distributed across N sub-keys, merged at query time
- Implement approximate heavy-hitter detection (Count-Min Sketch or Space-Saving algorithm) for memory-efficient tracking
- Pre-aggregate counts at the partition level in stream processors before merging into global leaderboards
- Design the system to handle 10x traffic spikes on individual URLs without degrading overall pipeline throughput
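The sharded-counter idea can be sketched with an in-memory stand-in; in production the sub-keys would live in a distributed store (Redis, DynamoDB) where spreading one hot URL's writes across N keys avoids a single-shard bottleneck:

```python
import random

class ShardedCounter:
    """Spread increments for a hot key across N sub-keys so no single
    counter absorbs all writes; reads merge the sub-keys at query time.

    Backed by a plain dict here for illustration; each sub-key would map to
    an independent shard of a real key-value store.
    """
    def __init__(self, num_shards=16):
        self.num_shards = num_shards
        self.store = {}

    def increment(self, url, amount=1):
        # Pick a random sub-key so concurrent writers rarely collide.
        shard = random.randrange(self.num_shards)
        key = f"{url}#{shard}"
        self.store[key] = self.store.get(key, 0) + amount

    def total(self, url):
        # Reads pay the merge cost: sum across all sub-keys.
        return sum(self.store.get(f"{url}#{s}", 0)
                   for s in range(self.num_shards))

counter = ShardedCounter(num_shards=8)
for _ in range(1000):
    counter.increment("news.example.com/viral-story")
print(counter.total("news.example.com/viral-story"))
# → 1000
```

The tradeoff is the classic one: writes become embarrassingly parallel, while reads fan out over N sub-keys, which is acceptable when leaderboards are materialized periodically rather than computed per request.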
4. Efficient Leaderboard Serving
Serving low-latency leaderboard queries while the underlying counts are continuously updating requires materialization and caching.
Hints to consider:
- Materialize leaderboard snapshots into Redis sorted sets, atomically swapping old and new versions
- Pre-compute leaderboards for all supported window sizes and scopes, serving from cache rather than computing on read
- Version leaderboard snapshots so clients can detect staleness and the system can roll back to a previous version if processing fails
- Use consistent hashing to distribute leaderboard serving across multiple cache nodes for high availability
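The versioned-snapshot pattern can be sketched with an in-memory stand-in for the cache layer (in production this would be Redis sorted sets keyed by window and scope); the pipeline publishes whole snapshots, readers always see either the old or the new version, and a failed pipeline run simply leaves the last good version serving:

```python
import threading

class LeaderboardCache:
    """Serve pre-computed top-K snapshots per (window, scope).

    The pipeline publishes a complete new snapshot with a version number;
    reads atomically pick up the latest published version. If a pipeline
    run fails, the previous snapshot keeps serving -- graceful degradation
    to slightly stale results.
    """
    def __init__(self):
        self._lock = threading.Lock()
        self._snapshots = {}  # (window, scope) -> (version, entries)

    def publish(self, window, scope, version, entries):
        with self._lock:
            # Replacing the tuple wholesale is the "atomic swap": readers
            # never observe a half-written leaderboard.
            self._snapshots[(window, scope)] = (version, entries)

    def read(self, window, scope):
        return self._snapshots.get((window, scope), (None, []))

cache = LeaderboardCache()
cache.publish("5m", "global", 1, [("a.com/x", 120), ("c.com/z", 200)])
print(cache.read("5m", "global"))
# → (1, [('a.com/x', 120), ('c.com/z', 200)])
```

Returning the version alongside the entries is what lets clients detect staleness, and keeping the previous snapshot until a new one is fully built is what makes rollback trivial.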
Suggested Approach
Step 1: Clarify Requirements
Confirm scope with the interviewer. Ask about the value of K (top 10, 100, or 1000?), whether scoped leaderboards (by region, language) are required, how URL normalization should work (strip parameters vs resolve redirects), and acceptable staleness. Clarify whether the system needs to handle click-through events in addition to shares. Establish whether the leaderboard should show metadata (title, thumbnail) or just URLs and counts.