Design Typeahead Search
System Design
Problem Statement
Design a typeahead search system that displays real-time search suggestions as users type their queries. The system should focus on the retrieval and display of non-personalized recommendations, delivering sub-100ms responses at scale while handling hot prefixes and multi-region traffic.
The core challenge is balancing indexing, caching, ranking, and tail-latency control while handling hot keys and bursty traffic. You must justify latency budgets, choose the right data structures (prefix indexes, tries), and design a resilient, cost-effective read (GET) path.
Key Requirements
Functional
- Real-time suggestions -- users see top-K suggestions update as they add or delete characters in the query (non-personalized)
- Prefix matching -- suggestions based on prefix matching with sensible handling of minor typos for longer prefixes
- Global consistency -- consistent, low-latency results globally with graceful behavior when there are no matches
- Selection action -- users select a suggestion to trigger a full search or navigate directly to the suggested item
Non-Functional
- Scalability -- handle millions of queries per second across global regions with read-heavy traffic patterns
- Reliability -- maintain availability during component failures with graceful degradation to cached results
- Latency -- p95 response time under 100ms including network round-trip; p99 under 200ms
- Freshness -- suggestions reflect new trending terms within minutes; popularity rankings update hourly
Interview Reports from Hello Interview
10 reports from candidates. Most recently asked at LinkedIn in early January 2026.
Also commonly asked at: Meta.
What Interviewers Focus On
Based on real interview experiences, these are the areas interviewers probe most deeply:
1. Prefix Index Design and Data Structures
Interviewers want to see how you structure data for fast prefix lookups with ranked results.
Hints to consider:
- Use a trie or prefix tree for in-memory suggestion lookup with top-K results stored at each node
- Consider Elasticsearch completion suggesters with edge n-grams for more flexible matching including fuzzy prefix search
- Pre-compute top suggestions for common short prefixes (1-3 characters) since they're queried most frequently
- Store suggestions as (prefix -> sorted list of completions) pairs in Redis for O(1) lookup
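The first hint above can be made concrete with a small sketch: a trie that materializes the top-K completions at every node, so a lookup costs O(len(prefix)) with no subtree traversal. This is an illustrative structure, not a production index; the `SuggestTrie` name, the fixed `k`, and the assumption that scores are precomputed offline are all choices made here for clarity.

```python
class TrieNode:
    __slots__ = ("children", "top_k")

    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.top_k = []      # (score, term) pairs, sorted descending, capped at K

class SuggestTrie:
    """Prefix trie storing the top-K completions at every node, so a
    lookup walks len(prefix) nodes and returns a precomputed list."""

    def __init__(self, k=5):
        self.k = k
        self.root = TrieNode()

    def insert(self, term, score):
        # Walk the term, updating the candidate top-K at each prefix node.
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
            node.top_k.append((score, term))
            node.top_k.sort(reverse=True)
            del node.top_k[self.k:]  # keep only the K best

    def suggest(self, prefix):
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # graceful behavior when there are no matches
        return [term for _, term in node.top_k]
```

In an interview you would note the trade-off: top-K-per-node spends memory (each term is referenced at every prefix length) to make reads a pure pointer walk, which is exactly what a latency budget under 100ms favors.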
2. Hot Prefix and Thundering Herd Mitigation
Short prefixes and trending terms cause extreme traffic concentration. A single "a" prefix might be queried millions of times.
Hints to consider:
- Cache results for short prefixes aggressively at multiple layers (CDN, application cache, Redis)
- Implement request coalescing: when multiple concurrent requests arrive for the same prefix, serve them from a single backend lookup
- Pre-warm caches for top-1000 most popular prefixes on startup and after index updates
- Apply per-prefix rate limits to protect the search tier from runaway query patterns
3. Client-Side Optimization
Reducing unnecessary backend requests is critical for both latency and cost. Interviewers expect client-side techniques.
Hints to consider:
- Implement client-side debouncing (100-200ms delay before sending request) to avoid queries on intermediate keystrokes
- Cache previous prefix results on the client and filter locally when the user continues typing the same prefix
- Abort in-flight requests when the user types another character, since the previous result is no longer needed
- Return compact payloads (just suggestion text and optional metadata) to minimize network transfer time
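The debounce-and-abort behavior above would live in browser JavaScript in practice, but the control flow can be sketched in Python with asyncio for illustration: each keystroke cancels the pending task (mirroring `AbortController` on the web) and starts a fresh delay, so only the final prefix reaches the backend. The `Debouncer` name and the 100ms window are assumptions for the sketch.

```python
import asyncio

class Debouncer:
    """Delay firing until the user pauses typing; each new keystroke
    cancels the pending request, so stale lookups never hit the backend."""

    def __init__(self, delay, fn):
        self.delay = delay  # debounce window in seconds (e.g. 0.1-0.2)
        self.fn = fn        # async function that queries the suggest API
        self._task = None

    def keystroke(self, prefix):
        if self._task is not None and not self._task.done():
            self._task.cancel()  # abort the stale in-flight request
        self._task = asyncio.get_running_loop().create_task(self._fire(prefix))
        return self._task

    async def _fire(self, prefix):
        await asyncio.sleep(self.delay)  # wait out the debounce window
        return await self.fn(prefix)
```

Note how debouncing and aborting compose: cancellation handles requests already sent, while the delay prevents most intermediate keystrokes from being sent at all.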
4. Index Update and Freshness Pipeline
Suggestions must reflect new content and trending terms without disrupting serving latency.
Hints to consider:
- Build new suggestion indexes offline (batch pipeline) and swap them atomically into the serving layer
- Use index aliases in Elasticsearch to enable zero-downtime rollovers when new indexes are ready
- Update popularity scores hourly using aggregated query/click data from analytics pipelines
- Support manual boosting/suppression of suggestions for editorial control or abuse prevention
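The atomic-swap idea in the first two hints can be shown with a minimal serving layer: indexes are built offline as immutable snapshots, and publishing one is a single reference assignment, so readers never observe a half-built index. This mirrors flipping an Elasticsearch alias from the old index to the new; the class names here are invented for the sketch.

```python
import threading

class SuggestionIndex:
    """Immutable snapshot of prefix -> ranked completions, built offline."""

    def __init__(self, table):
        self._table = dict(table)

    def lookup(self, prefix):
        return self._table.get(prefix, [])

class ServingLayer:
    """Serves from the live snapshot. A rebuilt index is published by
    swapping one reference, analogous to a zero-downtime alias rollover."""

    def __init__(self, index):
        self._index = index
        self._lock = threading.Lock()

    def suggest(self, prefix):
        return self._index.lookup(prefix)  # reads take no lock: the swap is atomic

    def publish(self, new_index):
        with self._lock:  # serialize concurrent publishers only
            old, self._index = self._index, new_index
        return old        # old snapshot can be dropped or kept for rollback
```

Returning the old snapshot from `publish` is a deliberate detail: keeping the previous index around for a short window gives you an instant rollback path if the new build turns out to be bad.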
Suggested Approach
Step 1: Clarify Requirements
Ask about the corpus: what are users searching for (people, content, companies)? Clarify whether suggestions are personalized or global. Confirm the expected query volume and geographic distribution. Ask about the latency budget end-to-end and what "real-time" means (every keystroke or debounced?). Understand how suggestions are ranked (popularity, recency, relevance) and how often rankings need to update.