Design a People You May Know Service
Problem Statement
Build a scalable system that generates personalized friend suggestions for users on a social networking platform. The system should analyze various signals including mutual connections, shared interests, geographic proximity, workplace/education affiliations, and user interaction patterns to recommend people a user is likely to know or want to connect with. Your solution must handle hundreds of millions of active users, process billions of connection events daily, and return fresh recommendations with low latency. The system should balance recommendation quality with computational efficiency, and account for privacy considerations when surfacing suggestions.
Key Requirements
Functional
- Generate personalized recommendations -- produce a ranked list of suggested connections for each user based on multiple signals
- Update recommendations incrementally -- refresh suggestions as users form new connections and interact with the platform
- Support ranking customization -- allow different recommendation strategies based on user segments or experimental features
- Provide explanation signals -- indicate why a particular person is being recommended (mutual friends, shared group, etc.)
- Filter inappropriate suggestions -- exclude blocked users, previously rejected suggestions, and privacy-protected profiles
Non-Functional
- Scalability -- support 500M+ active users with 10B+ potential connection pairs to evaluate
- Reliability -- maintain 99.9% availability for recommendation serving; tolerate data pipeline failures gracefully
- Latency -- return top recommendations in under 200ms at p99; batch generation can tolerate hours of delay
- Consistency -- eventual consistency acceptable for recommendations; strong consistency required for blocking/privacy rules
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Graph-Based Recommendation Algorithms
Interviewers want to see you balance algorithmic sophistication with computational feasibility at massive scale. The naive approach of traversing the entire friend graph is computationally infeasible.
Hints to consider:
- Discuss neighborhood-based approaches like "friends of friends" (two-hop traversal) and how to limit traversal depth
- Consider graph algorithms such as personalized PageRank or node embeddings for similarity computation
- Address the tradeoff between exact computation and approximation techniques like locality-sensitive hashing
- Explain how to handle cold start problems for new users with few or no connections
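As a rough illustration of the "friends of friends" hint, the sketch below counts mutual friends over a two-hop traversal of an in-memory adjacency map. The function name `friends_of_friends`, the `max_fanout` cap, and the toy graph are assumptions made for this example; a production system would read from a sharded graph store rather than a Python dict.

```python
from collections import Counter

def friends_of_friends(graph, user, max_candidates=10, max_fanout=1000):
    """Rank two-hop candidates for `user` by mutual-friend count.

    `graph` maps a user id to the set of that user's friends (hypothetical layout).
    """
    friends = graph.get(user, set())
    counts = Counter()
    for friend in friends:
        # Cap the per-friend fan-out so very high-degree accounts do not dominate cost.
        for candidate in sorted(graph.get(friend, set()))[:max_fanout]:
            # Skip the user themselves and people they are already connected to.
            if candidate != user and candidate not in friends:
                counts[candidate] += 1
    # Highest mutual count first; ties broken by id for deterministic output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:max_candidates]

graph = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
    "erin": {"bob"},
}
print(friends_of_friends(graph, "alice"))
```

Limiting depth to two hops and capping fan-out keeps the work per user bounded. A user with no connections gets an empty list, which is where cold-start fallbacks (for example, workplace or contact-import signals) would take over.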
2. Feature Engineering and Signal Processing
The quality of recommendations depends heavily on identifying and combining multiple signals effectively. Interviewers expect you to think beyond just mutual friends.
Hints to consider:
- Identify diverse signals: mutual friends, shared groups/pages, geographic proximity, workplace/education history, interaction patterns
- Discuss feature normalization and weighting strategies for combining heterogeneous signals
- Consider temporal signals like recent profile views, message exchanges, or event co-attendance
- Address privacy-preserving feature extraction from sensitive data like location or browsing behavior
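One way to think about normalizing and weighting heterogeneous signals is sketched below. The feature names, scaling constants, and weights are all illustrative assumptions; in a real system the weights would typically be learned rather than hand-set.

```python
import math

# Hypothetical hand-set weights, for illustration only; they sum to 1.0.
WEIGHTS = {
    "mutual_friends": 0.5,
    "shared_groups": 0.2,
    "same_employer": 0.2,
    "proximity": 0.1,
}

def normalize(raw):
    """Map raw signals with different units onto a common [0, 1] scale."""
    return {
        # Log scaling: the 1st mutual friend matters more than the 101st.
        "mutual_friends": min(math.log1p(raw["mutual_friends"]) / math.log1p(100), 1.0),
        # Linear scaling with a cap at 10 shared groups.
        "shared_groups": min(raw["shared_groups"] / 10, 1.0),
        # Binary signal.
        "same_employer": 1.0 if raw["same_employer"] else 0.0,
        # Exponential decay with distance; 50 km is an assumed length scale.
        "proximity": math.exp(-raw["distance_km"] / 50.0),
    }

def combined_score(raw):
    """Weighted sum of normalized features; stays within [0, 1]."""
    feats = normalize(raw)
    return sum(WEIGHTS[name] * value for name, value in feats.items())

near = {"mutual_friends": 12, "shared_groups": 3, "same_employer": True, "distance_km": 5}
far = {"mutual_friends": 12, "shared_groups": 3, "same_employer": True, "distance_km": 500}
print(combined_score(near) > combined_score(far))
```

Bringing every signal onto the same scale before weighting keeps a large-magnitude raw value, such as distance in kilometers, from swamping the others.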
3. System Architecture and Data Pipeline
The system involves both offline batch processing for expensive computations and online serving for low-latency retrieval. The architecture must cleanly separate these concerns.
Hints to consider:
- Design a Lambda architecture with a batch layer for model training and candidate generation plus an online serving layer
- Discuss data stores for different access patterns: graph databases for traversals, vector databases for similarity search, key-value stores for fast lookup
- Consider how to partition users or the social graph to parallelize computation effectively
- Address incremental update strategies to avoid full recomputation when the graph changes
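The incremental-update hint can be made concrete with a small sketch: when a new friendship edge arrives, only the mutual-friend counts touching the two endpoints need to change, so no full recomputation is required. The class `MutualCountIndex` and its in-memory dictionaries are assumptions for illustration; at scale this state would live in a partitioned store and be updated from an event stream.

```python
from collections import defaultdict

class MutualCountIndex:
    """Maintains mutual-friend counts incrementally as edges are added."""

    def __init__(self):
        self.friends = defaultdict(set)                       # user -> set of friends
        self.mutual = defaultdict(lambda: defaultdict(int))   # user -> other -> mutual count

    def add_friendship(self, u, v):
        if u == v or v in self.friends[u]:
            return  # ignore self-loops and duplicate events
        # Each existing friend w of `a` gains `a` as a mutual friend with `b`.
        for a, b in ((u, v), (v, u)):
            for w in self.friends[a]:
                self.mutual[b][w] += 1
                self.mutual[w][b] += 1
        self.friends[u].add(v)
        self.friends[v].add(u)

    def suggestions(self, user, k=10):
        # Counts are also kept for existing friends, so filter those out at read time.
        scored = [(c, n) for c, n in self.mutual[user].items()
                  if c not in self.friends[user]]
        scored.sort(key=lambda kv: (-kv[1], kv[0]))
        return scored[:k]

index = MutualCountIndex()
for a, b in [("alice", "bob"), ("alice", "carol"), ("bob", "dave"), ("carol", "dave")]:
    index.add_friendship(a, b)
print(index.suggestions("alice"))
```

Each new edge costs work proportional to the degrees of its two endpoints. Edge removals and very high-degree users would need additional handling that this sketch omits.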
4. Ranking and Personalization
Not all recommendations are equally valuable; the system must rank candidates and personalize based on user preferences and context.
Hints to consider:
- Discuss machine learning ranking models that combine multiple features into a relevance score
- Consider contextual signals like time of day, device type, or user engagement patterns
- Address diversity in recommendations to avoid filter bubbles and improve exploration
- Explain how to incorporate user feedback (accepted/rejected suggestions) to improve future recommendations
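A minimal sketch of two of the ideas above: a logistic scoring function that folds several features into one relevance score, followed by a greedy re-rank that caps how many suggestions can share the same explanation. The function names, the `max_per_reason` cap, and the sample candidates are hypothetical; a learned ranking model would replace the hand-supplied weights.

```python
import math

def logistic_score(features, weights, bias=0.0):
    """Combine features into a score in (0, 1) using a logistic function."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def rerank_with_diversity(candidates, k, max_per_reason=2):
    """Greedy top-k selection that limits how many picks share one reason.

    `candidates` is a list of (user_id, score, reason) tuples.
    """
    picked, per_reason = [], {}
    for user_id, score, reason in sorted(candidates, key=lambda c: (-c[1], c[0])):
        if per_reason.get(reason, 0) >= max_per_reason:
            continue  # this reason is already well represented
        picked.append((user_id, score, reason))
        per_reason[reason] = per_reason.get(reason, 0) + 1
        if len(picked) == k:
            break
    return picked

candidates = [
    ("u1", 0.92, "mutual_friends"),
    ("u2", 0.90, "mutual_friends"),
    ("u3", 0.88, "mutual_friends"),
    ("u4", 0.60, "coworker"),
]
print(rerank_with_diversity(candidates, k=3))
```

In this example the third mutual-friends candidate is skipped in favor of a lower-scoring coworker, trading a little relevance for variety. Accepted and rejected suggestions could be logged as labels for retraining the weights.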
5. Privacy and Abuse Prevention
Friend suggestions can inadvertently leak information or be exploited for harassment, requiring careful consideration of privacy implications.
Hints to consider:
- Design privacy controls allowing users to opt out or limit discoverability through certain signals
- Implement safeguards against profile scraping or enumeration attacks through the recommendation API
- Address asymmetric visibility where one user may see another in suggestions but not vice versa
- Consider regulations like GDPR that restrict how personal data can be used for recommendations
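The eligibility rules discussed here (and in the filtering requirement earlier) might be applied as a final pass before serving, roughly as below. The function name and the data shapes for `blocks`, `rejected`, and `hidden` are assumptions for this sketch; the important property is that a block in either direction removes the candidate.

```python
def filter_candidates(viewer, candidates, blocks, rejected, hidden):
    """Drop candidates the viewer must never see.

    blocks:   set of (blocker, blocked) pairs
    rejected: dict mapping viewer -> set of previously dismissed suggestions
    hidden:   set of users who opted out of appearing in suggestions
    """
    dismissed = rejected.get(viewer, set())
    return [
        c for c in candidates
        if c != viewer
        and (viewer, c) not in blocks   # viewer blocked the candidate
        and (c, viewer) not in blocks   # candidate blocked the viewer
        and c not in dismissed
        and c not in hidden
    ]

blocks = {("alice", "mallory"), ("trent", "alice")}
rejected = {"alice": {"oscar"}}
hidden = {"peggy"}
print(filter_candidates(
    "alice", ["bob", "mallory", "trent", "oscar", "peggy", "alice"],
    blocks, rejected, hidden,
))
```

Running this check at serving time against a strongly consistent block list, rather than only during batch generation, helps a new block take effect immediately even if cached recommendations are stale.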