Practice/Roblox/Design a skill-based matchmaking system
Design a skill-based matchmaking system
System DesignOptional
Problem Statement
Design a matchmaking system that pairs players based on their skill levels, replacing a basic random-assignment approach with an intelligent system that creates competitive, fair matches. The system should use a rating algorithm (such as Elo or Glicko-2) to track player skill over time, match players whose ratings indicate they will have closely contested games, and update ratings after each match based on the outcome.
While the previous matchmaking question focuses on the infrastructure of queuing and server allocation, this question goes deeper into the ranking algorithm, rating convergence, and statistical fairness of the matching process. At Roblox, where diverse game experiences range from casual to highly competitive, a skill-based system must be configurable: some games need tight skill bands for competitive integrity, while others prioritize fast matching with loose skill constraints. The system must also handle new players (cold start problem), returning players whose skill may have changed, and the psychological impact of rating visibility on player behavior.
Interviewers use this question to evaluate your understanding of rating algorithms, statistical modeling, the trade-offs between match quality and queue times, and how to build a system that adapts to different game contexts and player populations.
Key Requirements
Functional
- Skill rating tracking -- maintain a skill rating and confidence value for each player per game mode, updated after every match based on outcome and opponent strength
- Skill-based matching -- pair players or teams whose skill ratings predict a closely contested match, with configurable tolerance parameters per game mode
- Placement matches -- new players complete a calibration series where the system rapidly converges on their true skill level before assigning a stable rating
- Rating decay and recalibration -- reduce confidence in ratings for players who have been inactive, widening their matchmaking range until their skill is re-established
Non-Functional
- Scalability -- support rating calculations for 100 million player-game combinations and 1 million matches per hour across all game modes
- Latency -- rating lookups for matchmaking return within 10ms; post-match rating updates complete within 2 seconds
- Accuracy -- after 20 matches, predicted win probability should match actual outcomes within 5% for 80% of matches
- Fairness -- matches should produce win rates between 40-60% for each team in 85% of games during peak hours
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Rating Algorithm Selection and Implementation
The choice of rating algorithm determines how quickly the system converges on a player's true skill and how fairly it handles different scenarios. Interviewers want to see you reason about algorithm trade-offs.
Hints to consider:
- Elo is simple but assumes fixed skill variance; Glicko-2 adds a rating deviation (uncertainty) and volatility parameter that models how much a player's skill fluctuates
- TrueSkill (developed by Microsoft for Xbox) handles team games natively by computing each player's contribution to the team outcome
- Consider how the algorithm handles different game formats: 1v1, team-based, free-for-all, and asymmetric team sizes
- Discuss the K-factor (or equivalent learning rate) and how it should differ for new players (high, for fast convergence) versus established players (low, for stability)
2. Cold Start and New Player Experience
New players without rating history are difficult to match fairly. Interviewers probe how you handle the placement period without ruining the experience for either the new player or their opponents.
Hints to consider:
- Start new players at a default rating with high uncertainty; the first 10-20 matches use aggressive rating adjustments to converge quickly
- Consider using non-game signals (account age, performance in other game modes, time played) to seed an initial estimate
- During placement, match new players against a diverse range of opponents to gather maximum information per match
- Protect the experience for established players by limiting how many new players are placed in the same match
3. Dynamic Skill Window and Queue Time Balance
The skill window (how much rating difference is acceptable in a match) must expand dynamically based on queue conditions. Interviewers want to see how you balance match quality against wait time.
Hints to consider:
- Model the trade-off mathematically: define a utility function that weighs match quality (predicted win probability close to 50%) against wait time cost
- Expand the skill window gradually as a player waits, with the expansion rate configurable per game mode (competitive modes expand slowly, casual modes expand quickly)
- During off-peak hours, proactively suggest alternative game modes with shorter queues rather than creating poor-quality matches
- Track and report match quality metrics alongside wait times so game designers can tune the balance based on data
4. Team Balancing and Party Handling
In team games, the system must create balanced teams from a pool of matched players, which is a combinatorial optimization problem. Parties (premade groups) add additional constraints.
Hints to consider:
- After selecting 10 players for a 5v5 match, try multiple team splits and select the one minimizing the average rating difference between teams
- For parties, use the party's aggregate rating (average, or weighted toward the highest-rated member to prevent boosting) as the matching input
- Consider matching parties against parties of similar size to avoid the advantage that coordinated teams have over solo players
- Implement a "party handicap" that adjusts the expected win probability based on party coordination advantage, reducing rating gains for party wins
Suggested Approach
Step 1: Clarify Requirements
Ask about the game formats this system must support (1v1, small teams, large lobbies). Clarify whether the system serves a single game or is a platform service for many games with different rules. Ask about the existing rating data: are there historical match results to bootstrap ratings, or is this a greenfield system? Confirm whether ratings should be visible to players (affects design of display rating versus matchmaking rating). Ask about anti-abuse requirements: how should the system handle smurfing (skilled players creating new accounts), boosting (high-rated players carrying low-rated friends), and intentional losing to lower ratings? Finally, ask whether seasonal resets are required.