Account Takeover Prediction System
Problem Statement
Design a machine learning system to predict account takeover (ATO) risk for a payments API platform. Account takeover occurs when bad actors gain unauthorized access to legitimate user accounts through stolen credentials, session hijacking, or identity fraud.
The problem is intentionally open-ended about the specific use case. Before diving into the solution, you should clarify with the interviewer whether the focus is on login-time detection, session-level anomaly detection, or transaction-level risk scoring -- each leads to a different system design. The system must score risk in real time while balancing security (catching compromised accounts) against user experience (avoiding false lockouts of legitimate users).
Key Requirements
Functional
- Real-time risk scoring -- evaluate ATO risk at login, session activity, or transaction time and return a score within a strict latency budget
- Multi-signal feature engineering -- combine behavioral signals (login patterns, device fingerprints, IP reputation, geolocation) with historical account data
- Adaptive model retraining -- support periodic and triggered retraining as attacker tactics evolve over time
- Tiered response actions -- map risk scores to graduated responses such as allow, step-up authentication (MFA), temporary lock, or manual review
- Feedback loop integration -- incorporate user-confirmed ATO reports and false positive feedback to improve the model continuously
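The tiered-response requirement can be sketched as a simple score-to-action mapping. This is a minimal illustration; the threshold values here are assumptions, not recommendations, and in practice they would be tuned against business costs.

```python
def tiered_response(risk_score: float) -> str:
    """Map a model risk score in [0, 1] to a graduated action.

    Thresholds below are illustrative placeholders only.
    """
    if risk_score < 0.30:
        return "allow"
    if risk_score < 0.70:
        return "step_up_mfa"      # challenge the user with MFA
    if risk_score < 0.90:
        return "temporary_lock"   # lock pending user verification
    return "manual_review"        # escalate to fraud analysts
```

Graduated actions like these let borderline scores trigger friction (MFA) rather than hard blocks, which matters later when discussing false positive costs.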
Non-Functional
- Low latency -- scoring must complete in tens of milliseconds to avoid degrading the login or transaction experience
- High availability -- the risk scoring service must be always-on; downtime means either blocking all logins or letting attackers through
- Scalability -- handle spikes in login volume (e.g., credential stuffing attacks generating millions of attempts)
- Privacy compliance -- handle sensitive user data (IP addresses, device info, location) in compliance with data retention and privacy regulations
- Monitoring and drift detection -- track model accuracy, feature distributions, and attacker pattern shifts in production
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Model Design and Feature Engineering (Most Emphasized)
Interviewers spend roughly 20 minutes on this area. They want to see a rich, well-structured feature set and a thoughtful model choice.
Hints to consider:
- Behavioral features: login time of day, login frequency, time since last login, typical session duration
- Device and network features: device fingerprint, IP address reputation, geolocation, IP-to-account velocity
- Historical features: rolling count of failed logins, number of distinct devices in the past 30 days, average transaction amount
- Consider both numerical and categorical features, and discuss preprocessing (missing value handling, standardization, categorical encoding)
- Gradient-boosted trees are a strong baseline for tabular data; discuss when you might add a sequence model for session-level patterns
- Address risks during development: data imbalance, feature redundancy, overfitting on historical attack patterns
2. System Architecture and Serving Infrastructure
Interviewers spend roughly 15 minutes probing the end-to-end system, with particular attention to the real-time serving path.
Hints to consider:
- Sketch a streaming pipeline for ingesting login and session events
- Include a feature store with online (low-latency) and offline (batch training) layers
- Show where the model serving layer sits in the authentication flow (before the auth decision is returned)
- Discuss data latency: how quickly do new signals (e.g., a just-reported stolen credential) propagate to the scoring service?
- Cover online model update mechanisms -- can you do warm model swaps without downtime?
- Include monitoring dashboards for prediction accuracy, latency percentiles, and system throughput
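The real-time serving path above can be outlined in a few lines: look up precomputed features from the online store, score, and enforce the latency budget. The in-memory dict, the linear stand-in model, and the fail-open policy are all assumptions for illustration; a real system would use a low-latency store (e.g., Redis) and a served model, and fail-open vs. fail-closed is itself a design decision worth discussing.

```python
import time

# Stand-in for the online layer of the feature store
ONLINE_STORE = {"acct_42": {"failed_logins_7d": 3, "distinct_devices_30d": 4}}

def score_model(features: dict) -> float:
    # Stand-in for the served model; weights are illustrative only
    return min(1.0, 0.1 * features.get("failed_logins_7d", 0)
                    + 0.05 * features.get("distinct_devices_30d", 0))

def score_login(account_id: str, budget_ms: float = 50.0) -> dict:
    start = time.monotonic()
    features = ONLINE_STORE.get(account_id, {})   # online feature lookup
    risk = score_model(features)
    elapsed_ms = (time.monotonic() - start) * 1000
    if elapsed_ms > budget_ms:
        # Fail-open here (assumed policy): never block logins on scorer latency
        return {"risk": None, "action": "allow", "reason": "timeout_fail_open"}
    return {"risk": risk, "action": "step_up_mfa" if risk >= 0.3 else "allow"}
```

Sitting before the auth decision is returned, this call is on the critical path, which is why the budget check and the availability requirements above are non-negotiable.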
3. Business Impact and False Positive Trade-offs
Interviewers spend roughly 15 minutes connecting the ML system to business outcomes.
Hints to consider:
- Reducing the probability of successful account takeovers directly impacts user trust and platform revenue
- False positives (legitimate users locked out or forced through extra MFA) degrade user experience and increase support costs
- Discuss how you set the risk threshold: too aggressive locks out good users, too lenient lets attackers in
- Propose metrics for business impact: ATO rate reduction, false lockout rate, user friction index
- Consider tiered responses (step-up auth vs. hard block) to reduce friction for borderline cases
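The threshold trade-off above can be made concrete as a cost sweep: assign a (business-derived) cost to each missed ATO and each false lockout, then pick the threshold minimizing expected cost. The synthetic scores and the cost figures below are made-up assumptions; in practice the costs come from fraud-loss and support-cost analysis.

```python
import numpy as np

# Synthetic validation set: ~1% true ATOs, with ATOs scoring higher on average
rng = np.random.default_rng(1)
labels = rng.random(10_000) < 0.01
scores = np.where(labels, rng.beta(5, 2, 10_000), rng.beta(2, 5, 10_000))

COST_MISSED_ATO = 500.0   # fraud loss + trust damage (assumed figure)
COST_FALSE_LOCK = 5.0     # support ticket + user friction (assumed figure)

def expected_cost(threshold: float) -> float:
    missed = np.sum(labels & (scores < threshold))        # false negatives
    locked = np.sum(~labels & (scores >= threshold))      # false positives
    return COST_MISSED_ATO * missed + COST_FALSE_LOCK * locked

best = min(np.linspace(0.05, 0.95, 19), key=expected_cost)
```

Because a missed ATO costs orders of magnitude more than a lockout here, the optimal threshold lands aggressively low, which is exactly why tiered responses (MFA instead of a hard block) are attractive for the borderline band.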
4. Handling Class Imbalance (Common Follow-up)
Interviewers frequently ask follow-up questions about train/test split strategies for imbalanced data.
Hints to consider:
- Oversampling minority class (SMOTE) or undersampling majority class
- Stratified sampling to preserve class distribution in train/test splits
- Cost-sensitive learning where misclassifying an ATO is penalized more heavily
- Evaluate with precision-recall curves and PR-AUC rather than accuracy
- Discuss how label quality affects imbalance -- many ATOs go unreported, leading to noisy negative labels
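Two of the hints above, stratified splitting and cost-sensitive learning evaluated with PR-AUC, fit in a short sketch. The synthetic data and the class-weight value are assumptions for illustration; a logistic regression stands in for whatever classifier is actually used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data: positives (ATOs) are a few percent of rows
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 4))
y = ((X[:, 0] + rng.normal(scale=2.0, size=5000)) > 4.0).astype(int)

# Stratified split preserves the rare-class ratio in both train and test
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Cost-sensitive learning: misclassifying an ATO (class 1) costs 20x more
# (the 20x weight is an illustrative assumption)
clf = LogisticRegression(class_weight={0: 1, 1: 20}).fit(X_tr, y_tr)

# PR-AUC (average precision) instead of accuracy: a model predicting
# "never ATO" gets ~96% accuracy here but near-zero average precision
pr_auc = average_precision_score(y_te, clf.predict_proba(X_te)[:, 1])
```

The label-quality caveat in the last bullet still applies: if many ATOs are unlabeled negatives, both the class weights and the PR-AUC estimate are computed against noisy ground truth.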
5. Clarifying Ambiguity and Scoping
The open-ended nature of this problem is itself a test. Interviewers watch whether you ask clarifying questions before jumping to a solution.
Hints to consider:
- Ask: Is the focus on login-time risk, session anomaly detection, or post-login transaction risk?
- Ask: What response actions are available (block, MFA challenge, flag for review)?
- Ask: What labeled data is available and how is ATO ground truth established?
- Scoping the problem well signals senior-level thinking