Design a system to predict which users will attend Meta events in the real world?
Problem Statement
You need to design a machine learning system that predicts whether users will actually attend events they've interacted with on a social platform. The platform hosts both private events (birthday parties, small gatherings) and public events (concerts, festivals, community meetups). Users can RSVP with four response types: "Going", "Interested", "Can't Go", or "No Response". However, user RSVPs often don't match actual attendance behavior -- people mark "Going" but don't show up, or mark "Interested" and actually attend.
The system should predict real-world attendance for individual users to power features like: better event recommendations, accurate attendee counts for organizers, improved newsfeed ranking of event posts, and reminders to users likely to forget. The platform has 500M monthly active users, 10M events created monthly, and generates 200M RSVP actions per month. Predictions should be updated in near-real-time as users interact with events (comments, likes, shares), and the system must handle both cold-start scenarios (new events, new users) and provide explanations for predictions.
Key Requirements
Functional
- Attendance Prediction -- predict probability (0-1) that a specific user will physically attend a specific event
- Real-Time Updates -- refresh predictions as users interact with event pages (likes, comments, shares, RSVP changes)
- Cold Start Handling -- generate reasonable predictions for new users and newly created events
- Prediction Explanations -- provide interpretable factors influencing the prediction (past behavior, social signals, event type)
- Batch and Online Serving -- support both bulk predictions for all invited users and on-demand single predictions
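The batch/online split above can be sketched as a thin serving interface. This is a minimal sketch with a placeholder `score` method standing in for a trained model; the class and method names are illustrative, not a real API.

```python
from typing import Iterable


class AttendancePredictor:
    """Serving wrapper exposing both on-demand and bulk prediction paths."""

    def score(self, user_id: str, event_id: str) -> float:
        # Placeholder: a real model would consume feature vectors here.
        return 0.5

    def predict_one(self, user_id: str, event_id: str) -> float:
        """Online path: single on-demand prediction for one (user, event) pair."""
        return self.score(user_id, event_id)

    def predict_batch(
        self, pairs: Iterable[tuple[str, str]]
    ) -> dict[tuple[str, str], float]:
        """Batch path: bulk predictions, e.g. for all invited users of an event."""
        return {(u, e): self.score(u, e) for u, e in pairs}
```

Keeping both paths behind one interface lets the batch job and the online service share feature logic while scaling independently.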
Non-Functional
- Scalability -- handle 10M active events, 500M users, 200M monthly predictions
- Latency -- online predictions under 100ms p99, batch processing within 4 hours
- Reliability -- 99.9% uptime for prediction service, graceful degradation if model unavailable
- Model Quality -- maintain AUC > 0.80, recalibrate predictions weekly based on ground truth feedback
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Ground Truth Definition and Label Collection
Interviewers want to understand how you'll establish what "actual attendance" means and collect training labels, since users don't explicitly report whether they attended.
Hints to consider:
- Consider proxy signals like photo check-ins, location data (with privacy controls), post-event engagement
- Discuss time windows for labeling (how long after event to confirm attendance)
- Address sampling strategies for negative examples (users who didn't attend)
- Explain handling of uncertain labels and label noise in training data
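The hints above can be made concrete with a small labeling sketch: derive a binary label from proxy signals inside a post-event time window, and downsample the (much larger) negative class. Signal names and the schema are illustrative assumptions, not a real event-platform schema.

```python
import random
from datetime import datetime, timedelta

# Illustrative proxy signals for "actually attended"
PROXY_SIGNALS = {"photo_checkin", "location_ping", "post_event_comment"}


def label_attendance(user_signals, event_end, window_hours=48):
    """Binary label: did any proxy signal fire by `window_hours` after the
    event ended? Labels are only finalized once the window closes."""
    deadline = event_end + timedelta(hours=window_hours)
    return int(any(
        s["type"] in PROXY_SIGNALS and s["ts"] <= deadline
        for s in user_signals
    ))


def sample_negatives(invited, attended, ratio=3, seed=0):
    """Downsample non-attendees so negatives are at most `ratio`x positives,
    keeping the training set balanced and bounded."""
    negatives = [u for u in invited if u not in attended]
    k = min(len(negatives), ratio * max(len(attended), 1))
    return random.Random(seed).sample(negatives, k)
```

In practice the proxy signals would each carry a confidence weight, and low-confidence labels could be down-weighted or excluded rather than treated as clean ground truth.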
2. Feature Engineering for Social Context
The richness of features, especially social graph signals and user behavior patterns, often determines model performance.
Hints to consider:
- User features: historical attendance rate, RSVP-to-attendance gap, event type preferences, day-of-week patterns
- Event features: public vs private, size, organizer reputation, category, time until event
- Social features: number of friends attending, strength of connection to attendees, social proof signals
- Engagement features: comment frequency, likes, shares, time spent on event page
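The four feature groups above can be flattened into a single feature dict per (user, event) pair. A minimal sketch, with illustrative field names:

```python
def build_features(user, event, social, engagement):
    """Assemble a flat feature dict from the user, event, social, and
    engagement groups. All field names are illustrative."""
    return {
        # User: how often past "Going" RSVPs translated into attendance
        "hist_attend_rate": user["attended"] / max(user["rsvp_going"], 1),
        # Event
        "is_private_event": int(event["visibility"] == "private"),
        "hours_until_event": event["hours_until_start"],
        # Social proof
        "friends_going": social["friends_going"],
        "avg_tie_strength": social["avg_tie_strength"],
        # Recent engagement with the event page
        "comments_7d": engagement["comments_7d"],
    }
```

The `max(..., 1)` guard keeps the attendance-rate feature defined for users with no prior RSVPs, which also matters for the cold-start path.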
3. Model Architecture and Training Pipeline
Interviewers expect discussion of appropriate ML algorithms for this ranking/prediction task and how to keep models fresh.
Hints to consider:
- Consider gradient boosting (XGBoost/LightGBM) for tabular features vs neural networks for embedding-rich approaches
- Discuss how to encode categorical features (event types, user demographics) and handle high cardinality
- Explain training data generation from historical events with confirmed outcomes
- Address model staleness and retraining frequency (daily, weekly) based on prediction drift
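Two of the hints above, handling high-cardinality categoricals and generating training rows from historical events with confirmed outcomes, can be sketched in a few lines. Hashing into fixed buckets is one common way to bound cardinality; the event schema here is an assumption.

```python
import hashlib


def hash_bucket(value: str, n_buckets: int = 10_000) -> int:
    """Deterministically bucket a high-cardinality categorical (organizer id,
    event category) for tree models or embedding lookups. hashlib is used
    instead of built-in hash(), which is salted per process."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def make_training_rows(events):
    """Flatten historical events with confirmed per-user outcomes into
    (features, label) rows for a gradient-boosted model."""
    rows = []
    for ev in events:
        cat_bucket = hash_bucket(ev["category"])
        for user_id, attended in ev["outcomes"].items():
            rows.append((
                {"category_bucket": cat_bucket,
                 "event_size": len(ev["outcomes"]),
                 "user_bucket": hash_bucket(user_id)},
                int(attended),
            ))
    return rows
```

Rows like these would feed directly into XGBoost/LightGBM; a neural approach would instead learn embeddings keyed by the same buckets.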
4. Real-Time Feature Computation and Serving
The system must update predictions as users interact, requiring efficient feature pipelines and low-latency serving infrastructure.
Hints to consider:
- Separate features into static (pre-computed) vs dynamic (real-time) categories
- Use feature stores to cache user and event embeddings with TTL policies
- Discuss lambda architecture: batch for heavy features, streaming for interaction counts
- Explain caching strategies for frequently accessed predictions
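The static/dynamic split and TTL caching above can be illustrated with a toy feature-store entry: batch-computed features are cached with a TTL, and fresher streaming features override them at serving time. This is a sketch, not a real feature-store client; the optional `now` parameter just makes expiry testable.

```python
import time


class FeatureCache:
    """Tiny in-process TTL cache standing in for a feature-store lookup."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}

    def put(self, key, value, now=None):
        self._store[key] = (now if now is not None else time.time(), value)

    def get(self, key, now=None):
        entry = self._store.get(key)
        if entry is None:
            return None
        ts, value = entry
        if (now if now is not None else time.time()) - ts > self.ttl:
            del self._store[key]  # expired: force a refresh from batch
            return None
        return value


def merge_features(static, dynamic):
    """Combine precomputed (batch) and streaming (real-time) features;
    dynamic values win on key collisions since they are fresher."""
    return {**static, **dynamic}
```

In a lambda architecture, the batch layer would repopulate the cache daily while the streaming layer keeps interaction counters (likes, comments) current between refreshes.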
5. Cold Start and Model Interpretability
New events and users lack historical data, and predictions should be explainable to build user trust.
Hints to consider:
- Use content-based features and similar user/event clustering for cold starts
- Apply collaborative filtering or matrix factorization for initial embeddings
- Provide SHAP values or feature importance scores to explain individual predictions
- Fall back to simpler heuristics (geographic distance, friend attendance) when confidence is low
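The fallback hint above can be sketched as a simple scoring rule used when the model is unavailable or unconfident. The weights and thresholds are illustrative assumptions, not tuned values:

```python
def heuristic_score(friends_going: int, distance_km: float) -> float:
    """Cold-start fallback: more friends attending and shorter travel
    distance raise the score. Weights are illustrative."""
    social = min(friends_going / 5.0, 1.0)           # saturates at 5 friends
    proximity = 1.0 / (1.0 + distance_km / 10.0)     # decays with distance
    return 0.6 * social + 0.4 * proximity


def predict_with_fallback(model_score, confidence,
                          friends_going, distance_km, threshold=0.5):
    """Use the model's score when it is available and confident enough;
    otherwise fall back to the heuristic."""
    if model_score is not None and confidence >= threshold:
        return model_score
    return heuristic_score(friends_going, distance_km)
```

The same fallback path doubles as the graceful-degradation story from the reliability requirement: if the model service is down, `model_score` is `None` and the heuristic still returns something sensible.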