Practice/OpenAI/Crossword Puzzle Solver
Crossword Puzzle Solver
System DesignMust
Problem Statement
Design a service that solves crossword puzzles. Given a board with word slots (positions, directions, lengths) and a dictionary of ~1 million words, find any valid assignment of words that satisfies all intersection constraints.
The board is medium-sized (~50x50) with ~100 word slots to fill. This is NOT an algorithm optimization problem -- the interviewer wants to see you recognize that single-machine brute force is intractable (search space ~10^300) and design a distributed job scheduling system to parallelize the search.
Common mistake: Do not spend too much time optimizing the algorithm itself. The key insight is proving single-machine will not work, then designing how to distribute the search across workers, handle dead ends quickly, and dynamically split/redistribute tasks when some workers finish early or get stuck.
Key Requirements
Functional
- Solve puzzles -- given a board with word slots and a dictionary, find any valid word assignment
- Handle constraints -- words must fit their slots and share letters at intersections
- Return solution -- output the solved puzzle with words and their positions/directions
Out of Scope
- Puzzle generation (creating puzzles from scratch)
- Clue matching (clues may be present but are not used by the solver)
- Multiple solutions (finding one valid solution is sufficient)
- User interface or interactive editing
Non-Functional
- Board size -- ~50x50 with ~100 word slots
- Dictionary -- ~1 million words
- Latency -- minutes acceptable (batch computation, not real-time)
- Reliability -- must find a solution if one exists (completeness)
What Interviewers Focus On
Based on real interview experiences at OpenAI, these are the areas interviewers probe most deeply:
1. Proving Single-Machine is Intractable (Must Cover First)
The interviewer expects you to do a quick capacity estimation showing why brute force will not work, then move on to the distributed design.
Hints to consider:
- 100 word slots to fill, ~1,000 matching words per slot on average
- Naive search space: 1000^100 = 10^300 combinations
- Even with pruning, this is intractable on one machine
- Constraint propagation dramatically prunes the tree, but the pruned tree is still too large
- Spend 2-3 minutes on this, then move to distributed design
2. Distributed DFS with Dynamic Task Splitting (Core Design)
This is essentially a job scheduler problem. The core insight is parallelizing DFS via a shared task queue.
Hints to consider:
- Each task is a partial assignment (search state) representing a subtree to explore
- Workers pull tasks from a shared queue, explore locally with DFS
- When a worker encounters a state with many candidates, split into subtasks and enqueue them
- When candidates are few, explore sequentially to avoid queue overhead
- Use a
SPLIT_THRESHOLD to decide when to split vs. explore locally
- Dynamic splitting: track queue depth and worker idle percentage, adjust threshold accordingly
3. Constraint Propagation and Variable Ordering
The algorithm inside each worker matters -- not as the main design, but as a key optimization the interviewer will ask about.
Hints to consider:
- When assigning a word to a slot, propagate constraints to intersecting slots (eliminate words that no longer match)
- Most Constrained Variable (MCV) heuristic: fill the slot with fewest valid candidates first
- MCV detects dead ends earlier by tackling bottleneck slots first
- Forward checking: before assigning a word, verify all intersecting slots still have at least one valid candidate
- This is a classic CSP (Constraint Satisfaction Problem) technique
4. Dead-End Detection and Early Termination
Interviewers care about how the system handles failure paths and stops work when a solution is found.
Hints to consider:
- Dead-end detection: if no valid candidates exist for any remaining slot, backtrack immediately
- Forward checking catches dead ends before full exploration
- Early termination: when any worker finds a solution, broadcast cancellation to all others
- Use generation numbers on tasks so workers can discard stale work
- Detecting "no solution exists": when all tasks reach dead ends and no active tasks remain, the puzzle is unsolvable
5. Fault Tolerance
Hints to consider:
- Coordinator tracks task assignments with heartbeats
- If a worker dies, re-enqueue its active task
- Search state is persisted in database, not just worker memory
- ZooKeeper for coordinator leader election
- Workers reconnect automatically after coordinator failover