Level: Unknown Level
Round: Full Journey · Type: Coding · Difficulty: 7/10 · Duration: 180 min · Interviewer: Neutral
Topics: Hashing, Two Pointers, Aho-Corasick Algorithm, String Matching
Location: San Francisco Bay Area
Interview date: 2026-01-13
Question: Implement recipe matching using a rolling hash strategy.
Question: Implement recipe matching using O(1) extra space.
Question: Implement recipe matching for streaming data using the Aho-Corasick algorithm.
The problem is to determine if a list of recipes (each a list of ingredients) exists as a contiguous sequence within a main list of ingredients.
Part 1: Basic Solution (Hash Approach)
The initial solution uses a rolling hash to preprocess the ingredients list. I convert each ingredient into a number and compute hash codes for all possible sub-sequences, storing them in a hash map. To check a recipe, I compute its hash code and check if it exists in the map. The code provided uses two different prime numbers and bases to minimize collisions:
` from collections import defaultdict
class RecipeMatcher: _MOD1 = 1_000_000_007 _MOD2 = 1_000_000_009 _BASE1 = 911_382_323 _BASE2 = 972_663_749
def __init__(self, ingredients: list[str]):
self._token_id: dict[str, int] = {}
self._next_id = 1
self._hashes_by_len: dict[int, set[tuple[int, int]]] = defaultdict(set)
for token in ingredients:
self._get_token_id(token)
n = len(ingredients)
for i in range(n):
h1 = 0
h2 = 0
for j in range(i, n):
token_value = self._get_token_id(ingredients[j])
h1 = (h1 * self._BASE1 + token_value) % self._MOD1
h2 = (h2 * self._BASE2 + token_value) % self._MOD2
self._hashes_by_len[j - i + 1].add((h1, h2))
def _get_token_id(self, token: str) -> int:
if token not in self._token_id:
self._token_id[token] = self._next_id
self._next_id += 1
return self._token_id[token]
def can_make(self, recipe: list[str]) -> bool:
if not recipe:
return True
h1 = 0
h2 = 0
for token in recipe:
token_value = self._token_id.get(token)
if token_value is None:
# If an ingredient is missing from main list, recipe fails.
return False
h1 = (h1 * self._BASE1 + token_value) % self._MOD1
h2 = (h2 * self._BASE2 + token_value) % self._MOD2
return (h1, h2) in self._hashes_by_len[len(recipe)]
def match_recipes(ingredients: list[str], recipes: list[list[str]]) -> list[bool]: matcher = RecipeMatcher(ingredients) return [matcher.can_make(recipe) for recipe in recipes] `
Part 2: Low Memory Constraint
To reduce memory usage to O(1), I used a two-pointer approach. For each recipe, I iterate through the ingredients list using a sliding window. If a match is found, I return True; otherwise, I continue sliding the window. The code is as follows:
` def contains_recipe_o1_space(ingredients: list[str], recipe: list[str]) -> bool: n = len(ingredients) m = len(recipe)
if m == 0:
return True
if m > n:
return False
start = 0
while start <= n - m:
i = start
j = 0
while j < m and ingredients[i] == recipe[j]:
i += 1
j += 1
if j == m:
return True
start += 1
return False
def match_recipes_o1_space( ingredients: list[str], recipes: list[list[str]] ) -> list[bool]: return [contains_recipe_o1_space(ingredients, recipe) for recipe in recipes] `
Part 3: Streaming Data
For streaming data, I implemented the Aho-Corasick algorithm. This involves building a Trie from the recipes and creating fail links to handle partial matches efficiently. As ingredients arrive, I traverse the Trie and mark recipes as found when reaching the end of a recipe node. The code to build the trie and process the stream is given below:
` from collections import deque
class StreamRecipeMatcher: def init(self, recipes: list[list[str]]): self._recipes = recipes self._children = [dict()] # node -> {token: next_node} self._fail = [0] self._out = [[]] # node -> recipe ids ending here self._build_trie() self._build_fail_links()
def _new_node(self) -> int:
self._children.append({})
self._fail.append(0)
self._out.append([])
return len(self._children) - 1
def _build_trie(self) -> None:
for recipe_id, recipe in enumerate(self._recipes):
node = 0
for token in recipe:
if token not in self._children[node]:
self._children[node][token] = self._new_node()
node = self._children[node][token]
self._out[node].append(recipe_id)
def _build_fail_links(self) -> None:
q = deque()
for _, nxt in self._children[0].items():
self._fail[nxt] = 0
q.append(nxt)
while q:
node = q.popleft()
for token, nxt in self._children[node].items():
f = self._fail[node]
while f and token not in self._children[f]:
f = self._fail[f]
self._fail[nxt] = self._children[f].get(token, 0)
self._out[nxt].extend(self._out[self._fail[nxt]])
q.append(nxt)
def match_stream(self, ingredient_stream) -> list[bool]:
matched = [False] * len(self._recipes)
remaining = len(self._recipes)
state = 0
# Empty recipes match immediately, even if the stream has no tokens.
for recipe_id in self._out[0]:
if not matched[recipe_id]:
matched[recipe_id] = True
remaining -= 1
if remaining == 0:
return matched
for token in ingredient_stream:
while state and token not in self._children[state]:
state = self._fail[state]
state = self._children[state].get(token, 0)
for recipe_id in self._out[state]:
if not matched[recipe_id]:
matched[recipe_id] = True
remaining -= 1
if remaining == 0:
break
return matched
`