Design Typeahead Search
System Design
Problem Statement
Design a typeahead search system that displays real-time search suggestions as users type their queries. The system should focus on the retrieval and display of non-personalized recommendations, delivering sub-100ms responses at scale while handling hot prefixes and multi-region traffic.
The core challenge is balancing indexing, caching, ranking, and tail-latency control while handling hot keys and bursty traffic. You must justify latency budgets, choose the right data structures (prefix indexes, tries), and design a resilient, cost-effective read (GET) path.
Key Requirements
Functional
- Real-time suggestions -- users see top-K suggestions update as they add or delete characters in the query (non-personalized)
- Prefix matching -- suggestions based on prefix matching with sensible handling of minor typos for longer prefixes
- Global consistency -- consistent, low-latency results globally with graceful behavior when there are no matches
- Selection action -- users select a suggestion to trigger a full search or navigate directly to the suggested item
Non-Functional
- Scalability -- handle millions of queries per second across global regions with read-heavy traffic patterns
- Reliability -- maintain availability during component failures with graceful degradation to cached results
- Latency -- p95 response time under 100ms including network round-trip; p99 under 200ms
- Freshness -- suggestions reflect new trending terms within minutes; popularity rankings update hourly
Interview Reports from Hello Interview
10 reports from candidates. Most recently asked at LinkedIn in early January 2026.
Also commonly asked at: Meta.
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Prefix Index Design and Data Structures
Interviewers want to see how you structure data for fast prefix lookups with ranked results.
Hints to consider:
- Use a trie or prefix tree for in-memory suggestion lookup with top-K results stored at each node
- Consider Elasticsearch completion suggesters with edge n-grams for more flexible matching including fuzzy prefix search
- Pre-compute top suggestions for common short prefixes (1-3 characters) since they're queried most frequently
- Store suggestions as (prefix -> sorted list of completions) pairs in Redis for O(1) lookup
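The first hint above can be made concrete with a small sketch: a trie that materializes the top-K completions at every node, so a lookup costs O(len(prefix)) with no subtree traversal. This is an illustrative structure, not a production index; the `SuggestTrie` name, the fixed `k`, and the assumption that scores are precomputed offline are all choices made here for clarity.

```python
class TrieNode:
    __slots__ = ("children", "top_k")

    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.top_k = []      # (score, term) pairs, sorted descending, capped at K

class SuggestTrie:
    """Prefix trie storing the top-K completions at every node, so a
    lookup walks len(prefix) nodes and returns a precomputed list."""

    def __init__(self, k=5):
        self.k = k
        self.root = TrieNode()

    def insert(self, term, score):
        # Walk the term, updating the candidate top-K at each prefix node.
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
            node.top_k.append((score, term))
            node.top_k.sort(reverse=True)
            del node.top_k[self.k:]  # keep only the K best

    def suggest(self, prefix):
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # graceful behavior when there are no matches
        return [term for _, term in node.top_k]
```

In an interview you would note the trade-off: top-K-per-node spends memory (each term is referenced at every prefix length) to make reads a pure pointer walk, which is exactly what a latency budget under 100ms favors.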
2. Hot Prefix and Thundering Herd Mitigation
Short prefixes and trending terms cause extreme traffic concentration. A single "a" prefix might be queried millions of times.
Hints to consider:
- Cache results for short prefixes aggressively at multiple layers (CDN, application cache, Redis)
- Implement request coalescing: when multiple concurrent requests arrive for the same prefix, serve them from a single backend lookup
- Pre-warm caches for top-1000 most popular prefixes on startup and after index updates
- Apply per-prefix rate limits to protect the search tier from runaway query patterns
3. Client-Side Optimization
Reducing unnecessary backend requests is critical for both latency and cost. Interviewers expect client-side techniques.
Hints to consider:
- Implement client-side debouncing (100-200ms delay before sending request) to avoid queries on intermediate keystrokes
- Cache previous prefix results on the client and filter locally when the user continues typing the same prefix
- Abort in-flight requests when the user types another character, since the previous result is no longer needed
- Return compact payloads (just suggestion text and optional metadata) to minimize network transfer time
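The debounce-and-abort behavior above would live in browser JavaScript in practice, but the control flow can be sketched in Python with asyncio for illustration: each keystroke cancels the pending task (mirroring `AbortController` on the web) and starts a fresh delay, so only the final prefix reaches the backend. The `Debouncer` name and the 100ms window are assumptions for the sketch.

```python
import asyncio

class Debouncer:
    """Delay firing until the user pauses typing; each new keystroke
    cancels the pending request, so stale lookups never hit the backend."""

    def __init__(self, delay, fn):
        self.delay = delay  # debounce window in seconds (e.g. 0.1-0.2)
        self.fn = fn        # async function that queries the suggest API
        self._task = None

    def keystroke(self, prefix):
        if self._task is not None and not self._task.done():
            self._task.cancel()  # abort the stale in-flight request
        self._task = asyncio.get_running_loop().create_task(self._fire(prefix))
        return self._task

    async def _fire(self, prefix):
        await asyncio.sleep(self.delay)  # wait out the debounce window
        return await self.fn(prefix)
```

Note how debouncing and aborting compose: cancellation handles requests already sent, while the delay prevents most intermediate keystrokes from being sent at all.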
4. Index Update and Freshness Pipeline
Suggestions must reflect new content and trending terms without disrupting serving latency.
Hints to consider:
- Build new suggestion indexes offline (batch pipeline) and swap them atomically into the serving layer
- Use index aliases in Elasticsearch to enable zero-downtime rollovers when new indexes are ready
- Update popularity scores hourly using aggregated query/click data from analytics pipelines
- Support manual boosting/suppression of suggestions for editorial control or abuse prevention
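The atomic-swap idea in the first two hints can be shown with a minimal serving layer: indexes are built offline as immutable snapshots, and publishing one is a single reference assignment, so readers never observe a half-built index. This mirrors flipping an Elasticsearch alias from the old index to the new; the class names here are invented for the sketch.

```python
import threading

class SuggestionIndex:
    """Immutable snapshot of prefix -> ranked completions, built offline."""

    def __init__(self, table):
        self._table = dict(table)

    def lookup(self, prefix):
        return self._table.get(prefix, [])

class ServingLayer:
    """Serves from the live snapshot. A rebuilt index is published by
    swapping one reference, analogous to a zero-downtime alias rollover."""

    def __init__(self, index):
        self._index = index
        self._lock = threading.Lock()

    def suggest(self, prefix):
        return self._index.lookup(prefix)  # reads take no lock: the swap is atomic

    def publish(self, new_index):
        with self._lock:  # serialize concurrent publishers only
            old, self._index = self._index, new_index
        return old        # old snapshot can be dropped or kept for rollback
```

Returning the old snapshot from `publish` is a deliberate detail: keeping the previous index around for a short window gives you an instant rollback path if the new build turns out to be bad.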
Suggested Approach
Step 1: Clarify Requirements
Ask about the corpus: what are users searching for (people, content, companies)? Clarify whether suggestions are personalized or global. Confirm the expected query volume and geographic distribution. Ask about the latency budget end-to-end and what "real-time" means (every keystroke or debounced?). Understand how suggestions are ranked (popularity, recency, relevance) and how often rankings need to update.