Design a Tagging System for Atlassian Products
Problem Statement
Design a unified tagging system for Atlassian products -- Jira tickets, Confluence documents, and Bitbucket pull requests -- that allows users to add, remove, and rename tags on any content type, browse all items associated with a particular tag across products, and view a dashboard of the top K most popular tags. The system must maintain accurate usage analytics, enforce permissions, and deliver low-latency queries even when certain tags become extremely popular.
This is an Atlassian-specific question that tests your ability to build a cross-product platform feature. Think of it as hashtags for an enterprise productivity suite: users attach labels like "oncall", "q3-roadmap", or "bug" to items across multiple applications and can then discover everything related to that label in one view. Interviewers focus on the unified data model, cross-product search indexing, handling write contention on popular tags, permission enforcement across different product boundaries, and streaming analytics for the trending dashboard.
Key Requirements
Functional
- Tag CRUD -- users can add, remove, and rename tags on content across Jira tickets, Confluence documents, and Bitbucket PRs
- Cross-product browsing -- clicking a tag returns a unified, paginated list of all associated items from every product, with filtering by content type
- Trending dashboard -- display the top K most popular tags within configurable scopes such as organization, project, or time window
- Tag discovery -- autocomplete suggestions as users type, showing existing tags to encourage reuse and prevent duplicates
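The autocomplete requirement can be met with a simple prefix lookup over a sorted tag list. A minimal sketch (the per-organization index and its storage location are assumptions; in production this would live in memory or Redis rather than a Python list):

```python
import bisect

def autocomplete(sorted_tags: list[str], prefix: str, limit: int = 10) -> list[str]:
    """Return up to `limit` existing tags starting with `prefix`.

    Binary search finds the contiguous range of matching entries in
    O(log n); "\uffff" acts as a sentinel just past the last possible
    string with this prefix.
    """
    lo = bisect.bisect_left(sorted_tags, prefix)
    hi = bisect.bisect_left(sorted_tags, prefix + "\uffff")
    return sorted_tags[lo:hi][:limit]

# Hypothetical tag set for illustration
tags = sorted(["bug", "build", "oncall", "q3-roadmap", "qa"])
```

Surfacing existing tags this way is what encourages reuse and keeps near-duplicate tags ("bug" vs. "bugs") from proliferating.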
Non-Functional
- Scalability -- support 10M+ content items and 100K+ distinct tags with 10K concurrent users performing tag operations
- Latency -- tag search results within 500ms at p99; autocomplete within 100ms
- Reliability -- tag assignments must be durable; temporary outages in one product should not block operations in others
- Consistency -- tag changes appear in search results within 5 seconds; trending counts tolerate eventual consistency
What Interviewers Focus On
Based on real interview experiences at Atlassian, these are the areas interviewers probe most deeply:
1. Data Model and Storage Strategy
Interviewers want to see how you represent the many-to-many relationship between tags and content items when those items live in separate systems with different identifiers and schemas.
Hints to consider:
- Create a central Tag Service that owns the canonical mapping between tags and content, using a composite content identifier (source_app + native_id) for global uniqueness
- Store tag assignments in a relational database (PostgreSQL) with a junction table (content_id, tag_id, source_app) and indexes on both tag and content lookups
- Publish tag change events to Kafka for asynchronous consumers (search indexer, analytics aggregator) to decouple the write path from downstream processing
- Handle tag renames atomically: update the tag name in the canonical store and publish a rename event that consumers process to update their indexes
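The hints above can be sketched concretely. This uses SQLite in place of PostgreSQL purely so the snippet is self-contained; the table and column names are assumptions, not a real Atlassian schema. Because assignments reference a stable `tag_id`, a rename touches exactly one row in the canonical store:

```python
import sqlite3

# SQLite stands in for PostgreSQL; same relational shape.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE tags (
    tag_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE
);
CREATE TABLE tag_assignments (
    source_app TEXT NOT NULL,      -- 'jira' | 'confluence' | 'bitbucket'
    native_id  TEXT NOT NULL,      -- the product's own item identifier
    tag_id     INTEGER NOT NULL REFERENCES tags(tag_id),
    PRIMARY KEY (source_app, native_id, tag_id)  -- makes writes idempotent
);
CREATE INDEX idx_by_tag ON tag_assignments (tag_id);  -- tag -> items lookup
""")

def add_tag(source_app: str, native_id: str, name: str) -> None:
    """Idempotently attach a tag; the composite PK makes retries safe."""
    with db:  # one transaction
        db.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        (tag_id,) = db.execute(
            "SELECT tag_id FROM tags WHERE name = ?", (name,)).fetchone()
        db.execute(
            "INSERT OR IGNORE INTO tag_assignments VALUES (?, ?, ?)",
            (source_app, native_id, tag_id))

def rename_tag(old: str, new: str) -> None:
    """Rename once in the canonical store; junction rows are untouched.
    This is where a rename event would be published to Kafka."""
    with db:
        db.execute("UPDATE tags SET name = ? WHERE name = ?", (new, old))

add_tag("jira", "PROJ-42", "oncall")
add_tag("jira", "PROJ-42", "oncall")       # client retry: no duplicate row
add_tag("confluence", "page-7", "oncall")
rename_tag("oncall", "on-call")
```

The composite key `(source_app, native_id)` gives every item a globally unique identifier without requiring Jira, Confluence, and Bitbucket to share an ID scheme.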
2. Cross-Product Search and Permissions
Aggregating results from multiple products while respecting each product's access control rules is a critical challenge. Information must never leak through tags.
Hints to consider:
- Build a unified Elasticsearch index where each document represents one content item with fields for content ID, source app, tag IDs, and permission tokens
- Enforce permissions at query time by intersecting the requesting user's permission tokens with each document's access control list
- Subscribe to permission change events from each product to keep the search index's ACL metadata current
- Consider caching permission checks per user session to reduce the overhead of per-document filtering
3. Hot Tag Contention
Certain tags ("incident", "urgent") become extremely popular and create write hotspots. Interviewers probe your strategy for handling thousands of concurrent tag operations without overwhelming a single database partition.
Hints to consider:
- Use sharded counters for tag popularity: distribute count increments across N shards in DynamoDB or Redis and merge periodically
- Make tag assignment operations idempotent using conditional writes (upsert with content_id + tag_id as the key) to handle retries safely
- Use an event-driven architecture where tag assignments are appended to a log rather than updating shared counters in place
- Cache read-heavy tag browse results in Redis with short TTLs and invalidate on write events
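A sharded counter can be sketched in a few lines. A plain dict stands in for the store here; in production each shard would be its own Redis or DynamoDB key (something like `tag:{id}:count:{i}`, a naming convention assumed for illustration):

```python
import random

N_SHARDS = 16  # assumed fan-out; size to the expected write concurrency

# Stand-in for N separate Redis/DynamoDB keys per tag.
shards: dict[tuple[str, int], int] = {}

def increment(tag_id: str) -> None:
    """Bump a randomly chosen shard so no single key is a write hotspot."""
    key = (tag_id, random.randrange(N_SHARDS))
    shards[key] = shards.get(key, 0) + 1

def total(tag_id: str) -> int:
    """Merge step: sum all shards. Run this periodically (or on a cached
    read path), not on every dashboard request."""
    return sum(v for (t, _), v in shards.items() if t == tag_id)

for _ in range(1000):
    increment("incident")  # 1000 concurrent-style writes to a hot tag
```

The trade-off is that reads must fan out across N shards, which is why the merged total is typically cached or materialized rather than computed per request.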
4. Trending Analytics
The top-K dashboard requires streaming aggregation similar to a trending hashtags system but scoped to the enterprise context.
Hints to consider:
- Use Flink or a similar stream processor consuming tag events from Kafka to maintain windowed counts per tag, scope, and time window
- Pre-compute top-K sorted sets in Redis keyed by (scope, time_window) and refresh every few seconds
- For the all-time view, maintain running counters and periodically recompute the full ranking via a batch job
- Expose the dashboard via an API that reads from the pre-computed Redis sets with caching at the CDN layer
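The windowed top-K aggregation reduces to counting per `(scope, window)` bucket and reading the largest entries. A minimal in-process sketch, with `Counter` standing in for the Redis sorted sets (the `"trending:{scope}:{window}"` key layout is an assumption; a real deployment would have Flink write these from the Kafka event stream):

```python
from collections import Counter

# Per-(scope, window) counts, mirroring assumed Redis keys
# like "trending:{scope}:{window}".
windows: dict[tuple[str, str], Counter] = {}

def record(tag: str, scope: str, window: str) -> None:
    """Stream-processor step: bump one tag's count in one window bucket."""
    windows.setdefault((scope, window), Counter())[tag] += 1

def top_k(scope: str, window: str, k: int) -> list[tuple[str, int]]:
    """Dashboard read: pre-aggregated top-K for one scope and window."""
    return windows.get((scope, window), Counter()).most_common(k)

# Hypothetical event stream: tag usage within one org and 5-minute window
for tag, n in [("incident", 5), ("oncall", 3), ("bug", 8)]:
    for _ in range(n):
        record(tag, "org:acme", "2024-06-01T10:05")
```

Because reads never scan raw events, dashboard latency stays flat no matter how many tag operations occur, at the cost of the eventual consistency the requirements already allow.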