Design a system to predict which users will attend Meta events in the real world?
Problem Statement
You need to design a machine learning system that predicts whether users will actually attend events they've interacted with on a social platform. The platform hosts both private events (birthday parties, small gatherings) and public events (concerts, festivals, community meetups). Users can RSVP with four response types: "Going", "Interested", "Can't Go", or "No Response". However, user RSVPs often don't match actual attendance behavior -- people mark "Going" but don't show up, or mark "Interested" and actually attend.
The system should predict real-world attendance for individual users to power features like: better event recommendations, accurate attendee counts for organizers, improved newsfeed ranking of event posts, and reminders to users likely to forget. The platform has 500M monthly active users, 10M events created monthly, and generates 200M RSVP actions per month. Predictions should be updated in near-real-time as users interact with events (comments, likes, shares), and the system must handle both cold-start scenarios (new events, new users) and provide explanations for predictions.
Key Requirements
Functional
- Attendance Prediction -- predict probability (0-1) that a specific user will physically attend a specific event
- Real-Time Updates -- refresh predictions as users interact with event pages (likes, comments, shares, RSVP changes)
- Cold Start Handling -- generate reasonable predictions for new users and newly created events
- Prediction Explanations -- provide interpretable factors influencing the prediction (past behavior, social signals, event type)
- Batch and Online Serving -- support both bulk predictions for all invited users and on-demand single predictions
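The batch/online split above can be sketched as a thin serving interface. This is a minimal sketch with a placeholder `score` method standing in for a trained model; the class and method names are illustrative, not a real API.

```python
from typing import Iterable


class AttendancePredictor:
    """Serving wrapper exposing both on-demand and bulk prediction paths."""

    def score(self, user_id: str, event_id: str) -> float:
        # Placeholder: a real model would consume feature vectors here.
        return 0.5

    def predict_one(self, user_id: str, event_id: str) -> float:
        """Online path: single on-demand prediction for one (user, event) pair."""
        return self.score(user_id, event_id)

    def predict_batch(
        self, pairs: Iterable[tuple[str, str]]
    ) -> dict[tuple[str, str], float]:
        """Batch path: bulk predictions, e.g. for all invited users of an event."""
        return {(u, e): self.score(u, e) for u, e in pairs}
```

Keeping both paths behind one interface lets the batch job and the online service share feature logic while scaling independently.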
Non-Functional
- Scalability -- handle 10M active events, 500M users, 200M monthly predictions
- Latency -- online predictions under 100ms p99, batch processing within 4 hours
- Reliability -- 99.9% uptime for prediction service, graceful degradation if model unavailable
- Model Quality -- maintain AUC > 0.80, recalibrate predictions weekly based on ground truth feedback
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Ground Truth Definition and Label Collection
Interviewers want to understand how you'll establish what "actual attendance" means and collect training labels, since users don't explicitly report whether they attended.
Hints to consider:
- Consider proxy signals like photo check-ins, location data (with privacy controls), post-event engagement
- Discuss time windows for labeling (how long after event to confirm attendance)
- Address sampling strategies for negative examples (users who didn't attend)
- Explain handling of uncertain labels and label noise in training data
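The hints above can be made concrete with a small labeling sketch: derive a binary label from proxy signals inside a post-event time window, and downsample the (much larger) negative class. Signal names and the schema are illustrative assumptions, not a real event-platform schema.

```python
import random
from datetime import datetime, timedelta

# Illustrative proxy signals for "actually attended"
PROXY_SIGNALS = {"photo_checkin", "location_ping", "post_event_comment"}


def label_attendance(user_signals, event_end, window_hours=48):
    """Binary label: did any proxy signal fire by `window_hours` after the
    event ended? Labels are only finalized once the window closes."""
    deadline = event_end + timedelta(hours=window_hours)
    return int(any(
        s["type"] in PROXY_SIGNALS and s["ts"] <= deadline
        for s in user_signals
    ))


def sample_negatives(invited, attended, ratio=3, seed=0):
    """Downsample non-attendees so negatives are at most `ratio`x positives,
    keeping the training set balanced and bounded."""
    negatives = [u for u in invited if u not in attended]
    k = min(len(negatives), ratio * max(len(attended), 1))
    return random.Random(seed).sample(negatives, k)
```

In practice the proxy signals would each carry a confidence weight, and low-confidence labels could be down-weighted or excluded rather than treated as clean ground truth.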
2. Feature Engineering for Social Context
The richness of features, especially social graph signals and user behavior patterns, often determines model performance.
Hints to consider:
- User features: historical attendance rate, RSVP-to-attendance gap, event type preferences, day-of-week patterns
- Event features: public vs private, size, organizer reputation, category, time until event
- Social features: number of friends attending, strength of connection to attendees, social proof signals
- Engagement features: comment frequency, likes, shares, time spent on event page
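The four feature groups above can be flattened into a single feature dict per (user, event) pair. A minimal sketch, with illustrative field names:

```python
def build_features(user, event, social, engagement):
    """Assemble a flat feature dict from the user, event, social, and
    engagement groups. All field names are illustrative."""
    return {
        # User: how often past "Going" RSVPs translated into attendance
        "hist_attend_rate": user["attended"] / max(user["rsvp_going"], 1),
        # Event
        "is_private_event": int(event["visibility"] == "private"),
        "hours_until_event": event["hours_until_start"],
        # Social proof
        "friends_going": social["friends_going"],
        "avg_tie_strength": social["avg_tie_strength"],
        # Recent engagement with the event page
        "comments_7d": engagement["comments_7d"],
    }
```

The `max(..., 1)` guard keeps the attendance-rate feature defined for users with no prior RSVPs, which also matters for the cold-start path.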
3. Model Architecture and Training Pipeline
Interviewers expect discussion of appropriate ML algorithms for this ranking/prediction task and how to keep models fresh.
Hints to consider:
- Consider gradient boosting (XGBoost/LightGBM) for tabular features vs neural networks for embedding-rich approaches
- Discuss how to encode categorical features (event types, user demographics) and handle high cardinality
- Explain training data generation from historical events with confirmed outcomes
- Address model staleness and retraining frequency (daily, weekly) based on prediction drift
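Two of the hints above, handling high-cardinality categoricals and generating training rows from historical events with confirmed outcomes, can be sketched in a few lines. Hashing into fixed buckets is one common way to bound cardinality; the event schema here is an assumption.

```python
import hashlib


def hash_bucket(value: str, n_buckets: int = 10_000) -> int:
    """Deterministically bucket a high-cardinality categorical (organizer id,
    event category) for tree models or embedding lookups. hashlib is used
    instead of built-in hash(), which is salted per process."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def make_training_rows(events):
    """Flatten historical events with confirmed per-user outcomes into
    (features, label) rows for a gradient-boosted model."""
    rows = []
    for ev in events:
        cat_bucket = hash_bucket(ev["category"])
        for user_id, attended in ev["outcomes"].items():
            rows.append((
                {"category_bucket": cat_bucket,
                 "event_size": len(ev["outcomes"]),
                 "user_bucket": hash_bucket(user_id)},
                int(attended),
            ))
    return rows
```

Rows like these would feed directly into XGBoost/LightGBM; a neural approach would instead learn embeddings keyed by the same buckets.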
4. Real-Time Feature Computation and Serving
The system must update predictions as users interact, requiring efficient feature pipelines and low-latency serving infrastructure.
Hints to consider:
- Separate features into static (pre-computed) vs dynamic (real-time) categories
- Use feature stores to cache user and event embeddings with TTL policies
- Discuss lambda architecture: batch for heavy features, streaming for interaction counts
- Explain caching strategies for frequently accessed predictions
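The static/dynamic split and TTL caching above can be illustrated with a toy feature-store entry: batch-computed features are cached with a TTL, and fresher streaming features override them at serving time. This is a sketch, not a real feature-store client; the optional `now` parameter just makes expiry testable.

```python
import time


class FeatureCache:
    """Tiny in-process TTL cache standing in for a feature-store lookup."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}

    def put(self, key, value, now=None):
        self._store[key] = (now if now is not None else time.time(), value)

    def get(self, key, now=None):
        entry = self._store.get(key)
        if entry is None:
            return None
        ts, value = entry
        if (now if now is not None else time.time()) - ts > self.ttl:
            del self._store[key]  # expired: force a refresh from batch
            return None
        return value


def merge_features(static, dynamic):
    """Combine precomputed (batch) and streaming (real-time) features;
    dynamic values win on key collisions since they are fresher."""
    return {**static, **dynamic}
```

In a lambda architecture, the batch layer would repopulate the cache daily while the streaming layer keeps interaction counters (likes, comments) current between refreshes.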
5. Cold Start and Model Interpretability
New events and users lack historical data, and predictions should be explainable to build user trust.
Hints to consider:
- Use content-based features and similar user/event clustering for cold starts
- Apply collaborative filtering or matrix factorization for initial embeddings
- Provide SHAP values or feature importance scores to explain individual predictions
- Fall back to simpler heuristics (geographic distance, friend attendance) when confidence is low
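The fallback hint above can be sketched as a simple scoring rule used when the model is unavailable or unconfident. The weights and thresholds are illustrative assumptions, not tuned values:

```python
def heuristic_score(friends_going: int, distance_km: float) -> float:
    """Cold-start fallback: more friends attending and shorter travel
    distance raise the score. Weights are illustrative."""
    social = min(friends_going / 5.0, 1.0)           # saturates at 5 friends
    proximity = 1.0 / (1.0 + distance_km / 10.0)     # decays with distance
    return 0.6 * social + 0.4 * proximity


def predict_with_fallback(model_score, confidence,
                          friends_going, distance_km, threshold=0.5):
    """Use the model's score when it is available and confident enough;
    otherwise fall back to the heuristic."""
    if model_score is not None and confidence >= threshold:
        return model_score
    return heuristic_score(friends_going, distance_km)
```

The same fallback path doubles as the graceful-degradation story from the reliability requirement: if the model service is down, `model_score` is `None` and the heuristic still returns something sensible.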