Design an Auto-Complete Feature for a Store's Search Bar
System Design · Must
Problem Statement
Design an auto-complete search feature for an e-commerce store that provides real-time suggestions as users type queries into the search bar. The system must return relevant suggestions within 100-200ms of each keystroke, support millions of concurrent users during peak shopping hours, and rank suggestions by a combination of popularity, recency, and user personalization.
At Amazon, interviewers ask this to evaluate whether you can build a low-latency, high-throughput read path that serves prefix-matched suggestions while continuously incorporating new search trends and product launches. The challenge lies in balancing index freshness with query latency, handling the bursty nature of keystroke traffic, and providing personalized results without adding latency to the critical path.
Key Requirements
Functional
- Prefix-based suggestions -- as users type each character, display a ranked list of 5-10 suggested queries or products matching the prefix
- Popularity-weighted ranking -- suggestions are ranked by search frequency, click-through rate, and optionally user-specific signals
- Trending and fresh content -- newly launched products and trending queries appear in suggestions within minutes
- Spell correction and fuzzy matching -- handle common misspellings and suggest corrections alongside prefix matches
Non-Functional
- Scalability -- handle millions of concurrent users, each generating 5-10 suggestion requests per search session
- Reliability -- maintain 99.9% availability; gracefully degrade by showing cached suggestions if the ranking service is slow
- Latency -- return suggestions within 100ms at p99 to feel instantaneous to users
- Consistency -- eventual consistency acceptable; new products or trending queries visible within 5 minutes
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Trie-Based Index Design
The core data structure for prefix matching is a trie or its space-optimized variant. Interviewers want to see if you understand how to build and serve prefix lookups efficiently at scale.
Hints to consider:
- Use a compressed trie (radix tree) to store suggestion terms, with each node holding aggregate popularity scores
- Pre-compute and store the top-K suggestions at each trie node to avoid traversal on every query
- Shard the trie by prefix range (a-f, g-m, etc.) across multiple servers for horizontal scaling
- Alternatively, use sorted sets in Redis with lexicographic range queries for a simpler operational model
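The first two hints can be combined into a single structure: a trie whose nodes carry a precomputed top-K list, so a query is an O(len(prefix)) walk with no subtree traversal. This is a minimal single-node sketch (the names `SuggestionTrie` and `suggest` are illustrative, not a real library API); a production version would use a compressed radix tree and shard by prefix range as described above.

```python
import heapq

class TrieNode:
    __slots__ = ("children", "top_k")

    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.top_k = []      # list of (score, phrase), kept at size K

class SuggestionTrie:
    def __init__(self, k: int = 5):
        self.root = TrieNode()
        self.k = k

    def insert(self, phrase: str, score: int) -> None:
        """Insert a phrase, updating the top-K cache at every node on its path."""
        node = self.root
        for ch in phrase:
            node = node.children.setdefault(ch, TrieNode())
            self._update_top_k(node, phrase, score)

    def _update_top_k(self, node: TrieNode, phrase: str, score: int) -> None:
        # Drop any stale entry for the same phrase, then keep only the K best.
        node.top_k = [(s, p) for s, p in node.top_k if p != phrase]
        node.top_k.append((score, phrase))
        node.top_k = heapq.nlargest(self.k, node.top_k)

    def suggest(self, prefix: str) -> list[str]:
        """O(len(prefix)) lookup -- the ranked answers are precomputed."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [p for _, p in node.top_k]
```

The trade-off is write amplification: every insert touches every node on the phrase's path, which is why this structure pairs naturally with periodic batch rebuilds rather than per-query updates.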
2. Index Freshness and Update Pipeline
The suggestion index must incorporate trending queries and new products without rebuilding from scratch. Interviewers probe your update strategy.
Hints to consider:
- Use a streaming pipeline: log search queries and click events to Kafka, consume with aggregation workers that update popularity counts
- Rebuild suggestion indexes periodically (every 5 minutes) from aggregated data, and swap atomically (blue-green deployment of index)
- Maintain a hot buffer for very recent trending queries that gets merged with the main index at query time
- Use count-min sketch or approximate counters for real-time popularity tracking to avoid exact counting overhead
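To make the count-min sketch hint concrete, here is a minimal sketch of the data structure: a fixed-size table of counters indexed by several independent hashes, giving approximate frequencies in constant memory. The width/depth values are illustrative defaults, not tuned recommendations.

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counter: fixed memory, can only overestimate."""

    def __init__(self, width: int = 2048, depth: int = 4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _hashes(self, item: str):
        # One salted hash per row; blake2b's salt gives independent functions.
        for i in range(self.depth):
            h = hashlib.blake2b(item.encode(), digest_size=8,
                                salt=str(i).encode()).digest()
            yield int.from_bytes(h, "big") % self.width

    def add(self, item: str, count: int = 1) -> None:
        for row, idx in enumerate(self._hashes(item)):
            self.table[row][idx] += count

    def estimate(self, item: str) -> int:
        # True count <= estimate; hash collisions can only inflate a cell.
        return min(self.table[row][idx]
                   for row, idx in enumerate(self._hashes(item)))
```

An aggregation worker would feed each logged query into `add` and periodically flush the heaviest hitters into the suggestion index, avoiding a per-query exact counter in the hot path.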
3. Client-Side Optimization and Debouncing
Every keystroke potentially triggers a server request. Interviewers expect strategies to reduce unnecessary load.
Hints to consider:
- Implement client-side debouncing (wait 50-100ms after last keystroke before sending request) to batch rapid typing
- Cache recent prefix results on the client; if the user types "sho" and already has results for "sh", filter locally first
- Use HTTP caching headers for common prefixes so CDN or browser cache serves repeat queries
- Abort in-flight requests when a new keystroke arrives to prevent stale suggestions from rendering
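The debounce-and-abort pattern above can be sketched with asyncio task cancellation: each keystroke cancels any pending or in-flight request, and only a pause in typing lets one through. `fetch` here is a hypothetical async callable standing in for the real suggestion request; in a browser the equivalent is a timer plus `AbortController`.

```python
import asyncio

class DebouncedSearch:
    """Waits for a pause in typing, then allows at most one in-flight request."""

    def __init__(self, fetch, delay: float = 0.08):
        self.fetch = fetch    # async fn: prefix -> suggestions (placeholder)
        self.delay = delay    # debounce window, ~80ms
        self._task = None

    def on_keystroke(self, prefix: str):
        # A new keystroke cancels any pending or in-flight request,
        # so stale suggestions can never render over fresh ones.
        if self._task and not self._task.done():
            self._task.cancel()
        self._task = asyncio.ensure_future(self._debounced(prefix))
        return self._task

    async def _debounced(self, prefix: str):
        await asyncio.sleep(self.delay)   # survive the pause = send the request
        return await self.fetch(prefix)
```

With three rapid keystrokes "s", "sh", "sho", only the final prefix ever reaches the server, cutting request volume roughly in proportion to typing speed.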
4. Personalization Without Latency Impact
Adding user-specific signals improves relevance but must not slow down the critical path. Interviewers look for offline-online hybrid approaches.
Hints to consider:
- Pre-compute per-user suggestion boosts (based on purchase history, recent searches) offline and store in a fast cache keyed by user_id
- At query time, fetch the user's boost profile from cache and re-rank the generic top-K suggestions
- Fall back to generic (non-personalized) suggestions if the personalization cache is unavailable
- Use A/B testing to measure the lift from personalization and justify the added complexity
Suggested Approach
Step 1: Clarify Requirements
Confirm scope with your interviewer:
- Suggestion source -- query completions, product names, category names, or a mix?
- Expected query volume and concurrency
- Latency targets, and whether personalization is required
- Supported languages, and whether fuzzy matching or spell correction is in scope
- How fresh suggestions need to be