Design an autocomplete search system for Uber
Problem Statement
Design a location-based autocomplete system that provides real-time search suggestions as users type in the Uber app, with suggestions tailored to the user's current location and capable of handling 80K queries per second. Users expect suggestions to feel instant, reflect where they are, and get better with every keystroke.
The system returns addresses and points of interest biased to a rider's current location. It must blend full-text search, geospatial ranking, and personalization under strict tail-latency budgets at global scale. The core challenges are achieving sub-200 ms P99 latency for typeahead queries, handling hot prefixes (like "airp" or "sta") that concentrate traffic, keeping the location index fresh as POIs change, and incorporating user-specific signals (home, work, recent trips).
Interviewers at Uber ask this to assess whether you can prioritize low-latency reads, handle hot keys and bursty typing patterns, choose appropriate indexing and caching strategies, and operate a system that stays fresh and reliable.
Key Requirements
Functional
- Typeahead suggestions -- users see location suggestions that update with each keystroke as they type
- Location-biased results -- suggestions are ranked by proximity to the user's current location and recent trip history (home, work, recent pickups/drop-offs)
- Diverse suggestion types -- results include exact addresses, points of interest, neighborhoods, and airports with clear disambiguation
- Offline resilience -- under weak connectivity, users still see recent suggestions stored locally on the device
Non-Functional
- Scalability -- handle 80K+ queries per second at peak with global distribution
- Reliability -- 99.9% availability with graceful degradation to cached or recent results if the search tier is impaired
- Latency -- P99 under 200 ms end-to-end from keystroke to rendered suggestion list
- Freshness -- new POIs and address changes reflected in suggestions within hours
What Interviewers Focus On
Based on real interview experiences at Uber, these are the areas interviewers probe most deeply:
1. Indexing Strategy for Prefix + Geospatial Search
Combining prefix text matching with geographic proximity scoring is the core data challenge. Interviewers want to see how you model the index.
Hints to consider:
- Use Elasticsearch with completion suggesters or edge n-gram tokenizers for prefix matching combined with function_score for distance-based boosting
- Partition the search index by geographic region (city or country) so queries only search relevant data
- Include popularity signals in the scoring function: frequently selected locations rank higher
- Support multi-field matching: match on POI name, address, neighborhood, and category
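One way to picture the hints above is as a single Elasticsearch query body that combines a multi-field prefix match with distance decay and a popularity boost. The sketch below only builds the request body as a Python dictionary. The field names (name, address, neighborhood, category, location, popularity), the field boosts, the 5 km scale, and the per-city index naming are all illustrative assumptions, and it assumes the text fields are mapped for prefix search (for example as search_as_you_type or with an edge n-gram analyzer).

```python
from typing import Any


def index_for_city(city_id: str) -> str:
    """Hypothetical naming scheme: one index per city keeps each search small."""
    return f"places-{city_id}"


def build_suggest_query(prefix: str, lat: float, lon: float, size: int = 5) -> dict[str, Any]:
    """Build a query body for location-biased prefix search."""
    return {
        "size": size,
        "query": {
            "function_score": {
                # Prefix match across several fields; name matches weigh most.
                "query": {
                    "multi_match": {
                        "query": prefix,
                        "type": "bool_prefix",
                        "fields": ["name^3", "address", "neighborhood", "category"],
                    }
                },
                "functions": [
                    # Distance decay: full score within 500 m of the rider,
                    # falling to 0.5 at 5 km beyond that offset.
                    {
                        "gauss": {
                            "location": {
                                "origin": {"lat": lat, "lon": lon},
                                "scale": "5km",
                                "offset": "500m",
                                "decay": 0.5,
                            }
                        }
                    },
                    # Popularity signal: frequently selected places rank higher.
                    {
                        "field_value_factor": {
                            "field": "popularity",
                            "modifier": "log1p",
                            "missing": 0,
                        }
                    },
                ],
                # Add proximity and popularity, then scale the text score by the sum.
                "score_mode": "sum",
                "boost_mode": "multiply",
            }
        },
    }
```

In an interview it is worth saying aloud why the decay and the popularity terms are added rather than multiplied: a place with zero recorded popularity should still be able to rank on proximity alone.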
2. Caching and Hot Prefix Mitigation
Common prefixes create traffic hotspots that can overwhelm the search tier. Interviewers probe your caching strategy.
Hints to consider:
- Cache results for (prefix, geocell) combinations in Redis with short TTLs (30-60 seconds)
- Use request coalescing: if multiple concurrent requests arrive for the same (prefix, geocell), execute the search once and share the result
- Pre-warm caches for the most common prefixes (top 1000 prefixes per city) during off-peak hours
- Implement client-side caching of recent queries and results to avoid redundant network requests during typing
3. Typeahead-Specific Traffic Management
Typing generates a burst of requests that can be reduced with client and server coordination. Interviewers want to see explicit debounce and cancellation strategies.
Hints to consider:
- Client-side debounce: wait 100-200 ms after the last keystroke before sending the request
- Cancel in-flight requests when a new keystroke arrives: only the most recent prefix matters
- Server-side rate limiting per user to prevent abuse (e.g., max 10 requests per second per user)
- Return partial results quickly (show cached popular results) while fetching more personalized results asynchronously
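The debounce and cancellation hints can be illustrated with a small asyncio sketch. A real client would be written in the app's own language, so treat this purely as a model of the control flow; the class name, the callback signatures, and the 150 ms default are assumptions for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class DebouncedTypeahead:
    """Sends a request only after typing pauses; newer input cancels older work."""

    def __init__(self,
                 fetch: Callable[[str], Awaitable[list[str]]],
                 on_results: Callable[[str, list[str]], None],
                 delay_s: float = 0.15) -> None:
        self._fetch = fetch
        self._on_results = on_results
        self._delay_s = delay_s
        self._task: Optional[asyncio.Task] = None

    def on_keystroke(self, text: str) -> None:
        """Call from within a running event loop on every input change."""
        # A new keystroke supersedes both a pending debounce wait and an
        # in-flight request: only the most recent prefix matters.
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = asyncio.get_running_loop().create_task(self._run(text))

    async def _run(self, text: str) -> None:
        await asyncio.sleep(self._delay_s)   # debounce window
        results = await self._fetch(text)    # cancelled if the user keeps typing
        self._on_results(text, results)      # reached only by the latest request
```

Typing "a", "ai", "air", "airp" in quick succession therefore produces one network request for "airp" rather than four, which is the main lever for reducing typeahead load before any server-side rate limiting applies.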
4. Personalization and Recent History
Users expect their home, work, and recent trip locations to appear prominently. Interviewers evaluate how you incorporate user-specific signals.
Hints to consider:
- Store per-user recent locations and saved places in a fast lookup (Redis or embedded in the user profile)
- Blend personalized results with search results: show recent/saved matches at the top if they match the prefix
- Update personalization signals asynchronously after trip completion (Kafka event to personalization service)
- Handle cold-start: new users see only geographic and popularity-based results until they build history
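The blending step above can be sketched as a small merge function: saved and recent places that match the prefix come first, search results follow, and duplicates are dropped. The Place type, the word-prefix matching rule, and the limit of five are assumptions chosen for illustration; a real ranker would likely score personal and search results together rather than simply concatenating them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    place_id: str
    name: str


def matches_prefix(name: str, prefix: str) -> bool:
    """True if the whole name, or any word in it, starts with the prefix."""
    n, p = name.lower(), prefix.strip().lower()
    return n.startswith(p) or any(word.startswith(p) for word in n.split())


def blend(prefix: str, personal: list[Place], searched: list[Place],
          limit: int = 5) -> list[Place]:
    """Put matching saved/recent places first, then de-duplicated search results."""
    top = [place for place in personal if matches_prefix(place.name, prefix)]
    merged: list[Place] = []
    seen: set[str] = set()
    for place in top + searched:
        if place.place_id not in seen:
            seen.add(place.place_id)
            merged.append(place)
    return merged[:limit]
```

Cold start needs no special case in this shape: a new user has an empty personal list, so the output is just the geographic and popularity-ranked search results.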