Design a Trending Hashtags System
Problem Statement
Design a highly scalable system that tracks and computes the top K trending hashtags across a social or collaboration platform. The system should support multiple time windows (e.g., the last 5, 15, 30, or 60 minutes, or custom intervals), geographic filtering (local versus global), and category-based filtering (e.g., food, sports, technology). It must serve billions of users and reflect trends in near real-time.
At Atlassian, this translates to surfacing trending topics or tags across Confluence and Jira -- showing teams what subjects are generating the most activity right now. The core challenge is building a streaming pipeline that can ingest billions of hashtag events daily, compute top-K rankings over sliding time windows without scanning raw logs on every query, handle extreme skew when a single hashtag goes viral, and serve results with low latency. Interviewers use this question to test your understanding of event-time processing, approximate algorithms, partitioning to avoid hotspots, and the tradeoff between accuracy and freshness.
Key Requirements
Functional
- Top-K trending view -- display the top K (30-50) hashtags for a chosen time window such as the last 5, 15, 30, or 60 minutes
- Geographic filtering -- filter trending hashtags by scope including global, country, and city
- Category filtering -- segment trends by category such as food, sports, technology, or politics
- Near real-time freshness -- trend changes should be reflected within a few seconds of the underlying activity
Non-Functional
- Scalability -- handle 100M+ daily posts and billions of hashtag events with partitioned ingestion
- Latency -- p99 query latency under 100ms for trending hashtag retrieval
- Reliability -- tolerate node and datacenter failures without losing trend accuracy
- Consistency -- eventual consistency is acceptable; trends should converge within seconds
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Streaming Aggregation Pipeline
Interviewers want to see how you ingest, partition, and aggregate hashtag events into windowed counts without building up unbounded state or scanning raw logs at query time.
Hints to consider:
- Publish hashtag events to Kafka partitioned by hashtag ID (or a composite of hashtag and geo) for ordered, parallel processing
- Use Apache Flink with event-time sliding windows and watermarks to maintain running counts per hashtag per window
- Pre-aggregate at the edge or within application servers before publishing to reduce event volume on the message bus
- Emit incremental top-K updates from Flink to a serving layer rather than recomputing the full leaderboard on every query
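The windowed-aggregation idea above can be sketched in a few lines: keep a counter per minute bucket, expire buckets that fall out of the largest supported window, and answer top-K queries by merging only the buckets inside the requested window instead of rescanning raw events. This is an illustrative single-process sketch (class and method names are hypothetical), not a Flink job.

```python
from collections import Counter, deque

class SlidingTopK:
    """Per-minute hashtag counts with top-K queries over the last W
    minutes. Illustrative sketch of windowed aggregation; in production
    this state would live in a stream processor such as Flink."""

    def __init__(self, window_minutes=60):
        self.window = window_minutes
        self.buckets = deque()  # (minute, Counter) pairs, oldest first

    def ingest(self, minute, hashtag, count=1):
        # Events land in the bucket for their (event-time) minute.
        if not self.buckets or self.buckets[-1][0] != minute:
            self.buckets.append((minute, Counter()))
        self.buckets[-1][1][hashtag] += count
        # Expire buckets outside the largest supported window.
        while self.buckets and self.buckets[0][0] <= minute - self.window:
            self.buckets.popleft()

    def top_k(self, k, last_minutes):
        # Merge only the buckets inside the requested window.
        merged = Counter()
        newest = self.buckets[-1][0] if self.buckets else 0
        for minute, counts in self.buckets:
            if minute > newest - last_minutes:
                merged.update(counts)
        return merged.most_common(k)
```

Because buckets are pre-aggregated per minute, a query for any window touches at most W small counters rather than the raw event log, which is what keeps query latency bounded.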
2. Handling Hot Hashtags
A single viral hashtag can dominate traffic and create hotspots on counters and sorted sets. Interviewers assess your strategy for avoiding single-shard bottlenecks.
Hints to consider:
- Use sharded counters where a hot hashtag's events are distributed across N sub-counters and merged by a periodic combiner
- Consider Count-Min Sketch for approximate frequency estimation when exact counts are not required
- Apply local aggregation in each Flink task before a global merge to reduce cross-partition traffic
- Detect hot hashtags dynamically and route their events to additional shards automatically
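To make the approximate-counting option concrete, here is a minimal Count-Min Sketch: a fixed-size 2D table where each item is hashed into one cell per row, and the estimate is the minimum across rows. It never undercounts, may overcount on collisions, and its memory does not grow with the number of distinct hashtags. Parameters and hash choice here are illustrative.

```python
import hashlib

class CountMinSketch:
    """Minimal Count-Min Sketch: approximate per-hashtag frequencies in
    fixed memory. Estimates can only overcount, never undercount."""

    def __init__(self, width=2048, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _indices(self, item):
        # One independent hash per row, derived by salting blake2b.
        for row in range(self.depth):
            h = hashlib.blake2b(item.encode(), digest_size=8,
                                salt=bytes([row])).digest()
            yield row, int.from_bytes(h, "big") % self.width

    def add(self, item, count=1):
        for row, col in self._indices(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Minimum across rows bounds the overcount from collisions.
        return min(self.table[row][col] for row, col in self._indices(item))
```

A sketch like this works well for the long tail of hashtags; the handful of detected hot hashtags can still be tracked with exact (sharded) counters, since only those need precise ordering near the top of the leaderboard.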
3. Multi-Dimensional Filtering
Supporting filters by geography and category adds complexity to both the aggregation and serving layers. Interviewers probe whether you maintain separate pipelines or use a unified approach.
Hints to consider:
- Maintain separate sorted sets in Redis keyed by (window, geo, category) for fast filtered lookups
- Use Flink keyed state with composite keys (hashtag + geo + category) to aggregate per dimension
- Accept that the number of dimension combinations is bounded (a few hundred) and pre-compute each
- For custom or ad-hoc dimensions, fall back to a secondary index in Elasticsearch
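The pre-computed-dimensions approach can be sketched by fanning each event out to every leaderboard it belongs to, keyed by (window, geo, category), mirroring Redis sorted sets named along the lines of `trend:{window}:{geo}:{category}`. The key layout and function names below are hypothetical; in production the dicts would be Redis ZINCRBY/ZREVRANGE calls.

```python
import heapq
from collections import defaultdict

# One leaderboard (hashtag -> count) per (window, geo, category) tuple,
# standing in for Redis sorted sets keyed the same way.
boards = defaultdict(lambda: defaultdict(int))

def record(hashtag, geo, category, window="5m", count=1):
    # Fan one event out to its own scope plus the global/all rollups,
    # so every supported filter combination has a precomputed board.
    for g in (geo, "global"):
        for c in (category, "all"):
            boards[(window, g, c)][hashtag] += count

def top_k(k, window="5m", geo="global", category="all"):
    # A filtered query is a direct lookup on one precomputed board.
    board = boards[(window, geo, category)]
    return heapq.nlargest(k, board.items(), key=lambda kv: kv[1])
```

The write amplification (4x per event here) is the price of O(1) board lookup at query time, and it stays acceptable precisely because the set of dimension combinations is small and fixed.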