Design a Top-K System
Problem Statement
Design a system that efficiently retrieves top-k items (songs, videos, hashtags, etc.) based on user activity or engagement metrics within specified time windows. The system should handle real-time data aggregation and support queries like "top 10 songs in the last 7 days" or "trending hashtags in the past 24 hours."
Your system will ingest millions of scoring events per second from various sources, aggregate them across multiple time windows (hourly, daily, weekly, all-time), and serve low-latency queries to both end users and downstream services. The challenge lies in handling write-heavy workloads with extreme skew (viral content creates hotspots), maintaining accurate rankings despite out-of-order events, and delivering sub-100ms read latency while keeping infrastructure costs reasonable.
Key Requirements
Functional
- Time-windowed rankings -- support multiple fixed windows (1 hour, 24 hours, 7 days, 30 days, all-time) with each window updating independently
- Segmented leaderboards -- partition rankings by dimensions like geography (US, EU, Asia), category (music genre, game mode), or user attributes (age group, tier)
- Configurable result size -- allow clients to request top-K where K ranges from 10 to 1000, with stable ordering when scores tie
- Pagination and consistency -- enable page-through for large result sets with consistent snapshots across paginated requests
Non-Functional
- Scalability -- handle 10M+ events per second during peak hours with the ability to scale write throughput independently of read capacity
- Reliability -- tolerate datacenter failures and individual node crashes without losing more than 5 minutes of leaderboard history
- Latency -- p99 read latency under 100ms for top-K queries, with write-to-visibility delay (freshness) under 10 seconds for non-viral content
- Consistency -- eventual consistency is acceptable for leaderboards; rankings should converge within seconds and handle duplicate/late events gracefully
Interview Reports from Hello Interview
91 reports from candidates. Most recently asked at LinkedIn in early January 2026.
Also commonly asked at: Meta, Uber, Atlassian, Bloomberg, Amazon, Google.
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Write Path and Event Aggregation
Interviewers want to see how you'll handle millions of incoming score events without creating database hotspots or losing data. They're testing whether you understand stream processing, event deduplication, and the difference between raw events and aggregated counts.
Hints to consider:
- Use an event streaming platform (Kafka, Kinesis) as a durable buffer between producers and consumers to decouple ingestion from processing
- Partition events by item_id or a composite key (window + segment + item_id) to enable parallel processing while maintaining ordering per item
- Apply tumbling or sliding windows in a stream processor (Flink, Spark Streaming) to maintain per-window counts with watermarks handling late arrivals
- Batch writes to the storage layer (flush every 5 seconds or 1000 events) to reduce write amplification and improve throughput
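The batching hint above can be sketched as a small in-memory aggregator that flushes to the storage layer when either a size or time threshold is hit. This is a minimal illustration, not a production design; `sink`, the class name, and the thresholds are all assumptions chosen to mirror the "flush every 5 seconds or 1000 events" rule of thumb.

```python
import time
from collections import defaultdict


class BatchedCountFlusher:
    """Aggregate per-item score deltas in memory and flush one batched
    write when either the event-count or time threshold is reached."""

    def __init__(self, sink, max_events=1000, max_delay_s=5.0):
        self.sink = sink              # callable taking {item_id: summed_delta}
        self.max_events = max_events
        self.max_delay_s = max_delay_s
        self.counts = defaultdict(int)
        self.pending = 0
        self.last_flush = time.monotonic()

    def record(self, item_id, delta=1):
        self.counts[item_id] += delta
        self.pending += 1
        if (self.pending >= self.max_events
                or time.monotonic() - self.last_flush >= self.max_delay_s):
            self.flush()

    def flush(self):
        if self.counts:
            self.sink(dict(self.counts))  # one write instead of N
        self.counts.clear()
        self.pending = 0
        self.last_flush = time.monotonic()
```

Because events for the same item collapse into a single counter before hitting storage, a burst of 1000 events for one viral item becomes one write rather than 1000, which is exactly the write-amplification reduction the hint is after.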
2. Handling Hot Keys and Skew
Viral content creates enormous skew where a single item receives 1000x more events than average. Interviewers assess whether you recognize this problem and can propose solutions beyond naive sharding.
Hints to consider:
- Implement split counters where hot keys are sharded into N sub-counters (item_123_shard_0 through item_123_shard_{N-1}) and merged periodically
- Use probabilistic counting (HyperLogLog for unique counts, Count-Min Sketch for frequency) when exact precision isn't required to reduce memory
- Apply skew-aware routing where the ingestion layer detects hot items and distributes their events across more partitions dynamically
- Separate the "top 10" (which changes rarely) from the "long tail" (constantly churning) using tiered storage with different update frequencies
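The split-counter idea can be shown in a few lines: writes pick one of N sub-counter keys at random so no single key absorbs all of a hot item's traffic, and a periodic merge sums the shards back together. The shard count, key format, and dict-backed store are illustrative assumptions; in practice the store would be Redis or a similar key-value system.

```python
import random

NUM_SHARDS = 8  # assumed shard count for hot keys


def shard_key(item_id: str) -> str:
    """Pick one of N sub-counter keys at random to spread write load."""
    return f"{item_id}_shard_{random.randrange(NUM_SHARDS)}"


def increment(store: dict, item_id: str, delta: int = 1) -> None:
    key = shard_key(item_id)
    store[key] = store.get(key, 0) + delta


def merged_count(store: dict, item_id: str) -> int:
    """Periodic merge step: sum every sub-counter for the item."""
    return sum(store.get(f"{item_id}_shard_{i}", 0)
               for i in range(NUM_SHARDS))
```

The trade-off is that reads now touch N keys instead of one, which is why merging is done periodically by the aggregation pipeline rather than on every query.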
3. Maintaining and Serving Top-K Efficiently
Computing top-K by scanning all items on every query is a critical mistake. Interviewers look for pre-computation strategies, appropriate data structures, and cache-friendly designs.
Hints to consider:
- Pre-compute top-K lists for each window/segment combination using a stream processor with windowed aggregations that emit ranked results
- Store pre-computed rankings in Redis sorted sets (ZADD with scores) or a key-value store with TTL matching the window duration
- Use min-heaps of size K during aggregation to maintain top candidates efficiently (O(N log K) instead of O(N log N) for full sorting)
- Implement tiered caching: application cache (in-memory, 30s TTL) then CDN (1 min TTL) then Redis (authoritative, 10s freshness) then batch recomputation fallback
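The min-heap technique from the hints above is standard and worth sketching: keep a heap of at most K entries keyed by score, so the weakest current candidate sits at the root and each new item costs at most O(log K) to consider.

```python
import heapq


def top_k(scored_items, k):
    """Stream (item, score) pairs through a size-k min-heap:
    O(N log K) overall, versus O(N log N) for a full sort."""
    heap = []  # min-heap of (score, item); heap[0] is the weakest survivor
    for item, score in scored_items:
        if len(heap) < k:
            heapq.heappush(heap, (score, item))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, item))
    # Return highest score first, ready to serve
    return sorted(heap, reverse=True)
```

For serving, the ranked output of a pass like this is what gets written into the Redis sorted set or key-value store, so queries never scan the full item space.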
4. Time Window Management and Freshness
Different time windows have different update patterns and expiration logic. Interviewers probe whether you understand sliding vs tumbling windows, retention policies, and how to balance freshness with cost.
Hints to consider:
- Use tumbling windows for fixed-duration leaderboards (daily resets at midnight) and sliding windows for "last 24 hours" leaderboards that update continuously
- Implement window expiration through TTLs on keys (Redis EXPIRE) or partition-level deletion in time-series databases
- Trade off freshness for cost: update "last hour" every 30 seconds but "last 30 days" only every 5 minutes since it changes slowly
- Handle late-arriving events with allowed lateness in stream processing (accept events up to 1 hour late, then emit final window results)
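A common way to implement the sliding "last 24 hours" window is bucketed counting: keep one counter per coarse bucket (say, per hour), drop buckets that age out, and sum the survivors at query time. The sketch below assumes the caller supplies a monotonically increasing bucket timestamp; bucket granularity and the dict-backed buckets are simplifying assumptions.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Approximate a sliding window with one counter per coarse bucket
    (e.g. per hour), expiring buckets that fall outside the window."""

    def __init__(self, num_buckets=24):
        self.num_buckets = num_buckets
        self.buckets = deque()  # (bucket_ts, {item: count}), oldest first

    def record(self, bucket_ts, item_id, delta=1):
        if not self.buckets or self.buckets[-1][0] != bucket_ts:
            self.buckets.append((bucket_ts, defaultdict(int)))
            # Expire buckets that have slid out of the window
            while len(self.buckets) > self.num_buckets:
                self.buckets.popleft()
        self.buckets[-1][1][item_id] += delta

    def count(self, item_id):
        return sum(counts.get(item_id, 0) for _, counts in self.buckets)
```

The window slides at bucket granularity rather than continuously, which is the usual freshness-for-cost trade: hourly buckets make a "last 24 hours" count at most one hour stale while storing only 24 counters per item.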
5. Read Path Optimization and API Design
The read path must serve thousands of queries per second with tight latency requirements. Interviewers assess your understanding of caching layers, API pagination, and read-write separation.
Hints to consider:
- Design the API with explicit freshness parameters: GET /leaderboard/daily/global/top/100?asOf=2024-01-15T10:30:00Z for deterministic results
- Return pagination tokens that encode window boundaries and position to ensure consistent page-through even as rankings update
- Implement negative caching for non-existent segments (e.g., "top players in Antarctica") to prevent repeated expensive lookups
- Use read replicas or materialized views for serving, completely separated from the write aggregation pipeline to isolate query load
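The pagination-token hint can be made concrete with an opaque token that encodes both the snapshot boundary and the position, so page 2 reads from the same snapshot as page 1 even if rankings have shifted in between. The field names and base64-of-JSON encoding are illustrative assumptions, not a prescribed wire format.

```python
import base64
import json


def make_page_token(as_of: str, offset: int) -> str:
    """Encode the snapshot timestamp and position into an opaque token
    so subsequent pages are served from the same consistent snapshot."""
    payload = {"asOf": as_of, "offset": offset}
    return base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()


def parse_page_token(token: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(token.encode()))
```

On the server, the decoded `asOf` selects which materialized snapshot to read, and `offset` indexes into it; because the token is opaque, the encoding can change later without breaking clients.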