Design a machine learning system that ranks news feed items based on relevance, user preferences, engagement history, and content freshness. The system must serve personalized rankings to millions of users in real-time, learn continuously from user interactions, and balance multiple objectives including engagement, content diversity, and business goals.
At Atlassian, this maps to personalizing the Confluence home feed or Jira dashboard -- surfacing the most relevant pages, tickets, or updates for each user based on their role, team activity, and past behavior. Interviewers use this question to test your understanding of the full ML system lifecycle: feature engineering, model architecture, training and serving infrastructure, feedback loops, and the operational challenges of deploying ML in production. Strong answers demonstrate a clear two-stage funnel (candidate generation followed by ranking), practical feature design, and awareness of cold start, position bias, and model freshness tradeoffs.
Based on real interview experiences, these are the areas interviewers probe most deeply:
Scoring billions of items per request is infeasible. Interviewers want to see a retrieval stage that narrows candidates before an expensive ranking model scores them.
Hints to consider:
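One way to make the retrieval stage concrete: a toy first-stage candidate generator. Brute-force cosine similarity stands in for a real ANN index (such as FAISS or ScaNN); all data and names here are illustrative.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_candidates(user_embedding, item_index, k=1000):
    """Stage 1: narrow the corpus to the top-k nearest items.

    item_index maps item_id -> embedding. A production system would
    query an ANN index instead of this O(n) scan, trading a little
    recall for orders of magnitude less work before the ranking model.
    """
    scored = ((cosine(user_embedding, emb), item_id)
              for item_id, emb in item_index.items())
    top = sorted(scored, reverse=True)[:k]
    return [item_id for _, item_id in top]

index = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(retrieve_candidates([1.0, 0.0], index, k=2))  # -> ['a', 'b']
```

In an interview, naming the recall/latency tradeoff of the ANN stage (and that the heavy model only ever sees the retrieved candidates) is the key point this sketch illustrates.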
The quality of the ranking depends heavily on the features you extract (user, content, and user-content cross features) and the model architecture you choose, e.g. gradient-boosted trees versus a deep ranking network.
Hints to consider:
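A sketch of feature assembly for the ranking stage, combining the three feature families a strong answer usually names: batch user features, content features, user-content cross features, plus a real-time session signal. The feature names here are hypothetical, not a fixed schema.

```python
def build_features(user, item, context):
    """Assemble one feature row for a (user, item) pair at request time."""
    return {
        # Batch user feature, precomputed offline (e.g. hourly).
        "user_ctr_30d": user.get("ctr_30d", 0.0),
        # Content feature: freshness relative to request time.
        "item_age_hours": context["now_hours"] - item["created_hours"],
        # Cross feature: this user's historical affinity for this item's category.
        "user_category_ctr": user.get("ctr_by_category", {}).get(item["category"], 0.0),
        # Real-time session feature from the streaming pipeline.
        "session_clicks": len(context.get("session_clicked_ids", [])),
    }

feats = build_features(
    user={"ctr_30d": 0.12, "ctr_by_category": {"sports": 0.3}},
    item={"created_hours": 95, "category": "sports"},
    context={"now_hours": 100, "session_clicked_ids": ["x", "y"]},
)
print(feats)  # includes user_category_ctr=0.3, item_age_hours=5
```

Cross features like `user_category_ctr` are often what separates a personalized ranker from a popularity sort, which is worth saying explicitly in the interview.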
Interviewers probe how you train models, deploy them, and keep them fresh without introducing latency or instability.
Hints to consider:
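One standard way to ship a fresh model without destabilizing the feed is a canary rollout. A minimal sketch, assuming deterministic user-ID hashing so the same user always sees the same model variant (function and model names are hypothetical):

```python
import hashlib

def pick_model(user_id, canary_fraction=0.05):
    """Route a stable slice of users to the new model.

    Hashing the user ID (rather than sampling per request) keeps each
    user's experience consistent and makes A/B metrics attributable.
    """
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 10000
    return "model_v2" if bucket < canary_fraction * 10000 else "model_v1"

print(pick_model("user-123"))  # stable across calls for the same user
```

The same bucketing mechanism typically backs the online A/B tests used for promotion decisions; shadow scoring (running the new model without serving its output) is the usual step before even a canary.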
New users and new content lack the signals that drive personalization. Position bias means users engage more with top-ranked items regardless of actual relevance.
Hints to consider:
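As one concrete mitigation for position bias, a sketch of inverse propensity weighting of click labels. The propensity table below is made up for illustration; in practice examination probabilities are estimated, e.g. from small randomized-ranking experiments.

```python
# Estimated probability that a user even examines each position.
propensity = {1: 0.9, 2: 0.6, 3: 0.4, 4: 0.25, 5: 0.15}

def ipw_weight(position, clicked):
    """Weight a clicked impression by 1 / examination probability.

    Clicks at low positions are rarer simply because fewer users look
    there, so up-weighting them corrects the training signal toward
    true relevance rather than rank-induced exposure.
    """
    if not clicked:
        return 1.0
    return 1.0 / propensity.get(position, 0.1)

print(ipw_weight(4, True))  # -> 4.0: a click at position 4 counts 4x
```

Mentioning that the propensities themselves must be learned (and that naive training on logged clicks just reinforces the old ranker) shows awareness of the feedback-loop problem.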
Confirm the type of content being ranked (articles, tickets, comments), the primary optimization objective (engagement, relevance, or a blend), the expected user base size, and latency requirements. Ask about the data available for training -- are impression logs and click logs already collected? Clarify whether the ranking must also handle ads or promoted content.
Sketch the major components: a Content Ingestion Service that processes new items and computes content features; a Feature Store serving pre-computed user and content features with both batch and real-time update paths; a Candidate Generation Layer using multiple ANN-based retrievers; a Ranking Service that scores candidates with an ML model; a Feed Assembly Service that applies business rules, diversity constraints, and pagination; and a Feedback Loop that captures user interactions and feeds them back into training data and real-time features.
Walk through a feed request end to end. The user opens the app, the feed service calls candidate generation (ANN search across multiple indexes), retrieves 1,000-2,000 candidates, fetches features from the feature store, sends the feature matrix to the ranking model for scoring, applies post-processing (diversity injection, deduplication, business rule boosts), and returns the top N items. Discuss how features are computed: batch features like "user's 30-day click-through rate by category" are precomputed hourly, while real-time features like "items clicked in this session" are computed on the fly from a streaming pipeline.
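The request path above can be sketched as a small pipeline. Every stage here is a local stub standing in for an RPC to a separate service, and the items and scores are placeholders, but the shape of the flow is the point:

```python
def generate_candidates(user_id):
    # Stage 1: ANN retrieval across multiple indexes (stubbed).
    return ["item%d" % i for i in range(5)]

def fetch_features(user_id, candidates):
    # Feature-store lookup; here the only "feature" is a fake model input.
    return {c: {"score_input": int(c[4:])} for c in candidates}

def rank(features):
    # Stage 2: ranking-model scoring (stubbed as a sort on one feature).
    return sorted(features, key=lambda c: features[c]["score_input"], reverse=True)

def post_process(ranked, limit):
    # Deduplicate then truncate; diversity injection and business-rule
    # boosts would also live here.
    seen, out = set(), []
    for item in ranked:
        if item not in seen:
            seen.add(item)
            out.append(item)
        if len(out) == limit:
            break
    return out

def serve_feed(user_id, limit=3):
    candidates = generate_candidates(user_id)
    features = fetch_features(user_id, candidates)
    return post_process(rank(features), limit)

print(serve_feed("u1"))  # -> ['item4', 'item3', 'item2']
```

Walking an interviewer through the code-level shape of the pipeline, then attaching a latency budget to each stage (retrieval, feature fetch, scoring, assembly), is an effective way to structure the deep dive.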
Cover model evaluation: use offline metrics (AUC, NDCG) for development and online A/B testing for production decisions, tracking engagement rate, dwell time, and daily active user retention. Discuss fallback strategies: if the ML service is down, serve a recency-based feed from a cached ranking. Address monitoring: track model latency, prediction distribution drift, feature freshness, and click-through rate trends. Explain how to prevent filter bubbles by injecting exploration items and monitoring content diversity in served feeds.
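NDCG, the main offline ranking metric mentioned above, is simple enough to define from scratch in an interview. A minimal implementation over graded relevance labels (the example labels are invented):

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: relevance discounted by log2 of position.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg_at_k(ranked_relevances, k):
    """NDCG@k: DCG of the served order divided by DCG of the ideal order."""
    ideal = sorted(ranked_relevances, reverse=True)
    denom = dcg(ideal[:k])
    return dcg(ranked_relevances[:k]) / denom if denom else 0.0

# Served order got the best item (rel=3) first but ranked a rel=2 item
# below a rel=0 item, so NDCG@4 is slightly below 1.
print(round(ndcg_at_k([3, 0, 2, 1], k=4), 3))  # -> 0.93
```

Because NDCG discounts by position, it rewards putting relevant items near the top, which matches how users actually consume a feed; AUC, by contrast, ignores position entirely.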
"Design a personalized ranking system for recommending content to users."