p39 — Word Search
Source: LeetCode 79 · Medium-Hard · Topics: Array, Backtracking, Matrix, DFS Companies (2024–2025 frequency): Meta (very high), Amazon (high), Microsoft (high), Google (high), Apple (medium) Loop position: algorithm round, frequently the centerpiece. The “in-place visited marking” trick is a senior-vs-mid discriminator.
1. Quick Context
Given a 2D grid of letters and a target word, return True if the word can be formed by a path through horizontally/vertically adjacent cells, where each cell is used at most once.
The pattern is grid backtracking: for each cell matching word[0], DFS into the 4 neighbors trying to match word[1], word[2], etc., with in-place visited marking (overwrite the cell with a sentinel like '#', restore on backtrack).
def dfs(r, c, i):
if i == len(word): return True
if not in_bounds(r, c) or board[r][c] != word[i]: return False
board[r][c] = '#' # mark visited (in-place)
found = any(dfs(r+dr, c+dc, i+1) for dr, dc in DIRS)
board[r][c] = word[i] # restore
return found
Time: O(R · C · 4^L) worst case (L = word length). Space: O(L) recursion (no extra visited set).
The discriminators:
- In-place visited marking vs. a separate
visitedset (saves O(R·C) memory for the set per call). - Pruning — return immediately on first character mismatch.
- The Trie extension (LC 212 Word Search II): search for MANY words simultaneously via a Trie root.
2. LeetCode Link + Attempt Gate
🔗 https://leetcode.com/problems/word-search/
STOP. 20-min timer.
3. Prerequisite Concepts
- Grid DFS (Phase 01 Lab 09 — tree traversal generalizes).
- The 4-direction delta convention:
DIRS = ((-1,0),(1,0),(0,-1),(0,1)). - Backtracking template (p36).
4. How to Approach (FRAMEWORK Steps 1–9)
Step 1 — Restate: “Find ANY path in the grid spelling word, using each cell at most once. Return bool.”
Step 2 — Clarify:
- “Diagonal moves?” (No — 4-directional.)
- “Case-sensitive?” (Yes per LC.)
- “Can the same cell appear twice in the path?” (No.)
- “Grid mutation allowed?” (For in-place visited — verify with interviewer; usually yes if restored.)
- “Empty word?” (Per LC, word.length ≥ 1.)
- “What’s R, C, L?” (R, C ≤ 6 usually; L ≤ 10. Tight constraints because 4^L explodes.)
Step 3 — Constraints: R, C ≤ 6, L ≤ 15 typical → 36 starting cells × 4^15 ≈ 1B worst case, but pruning on first mismatch keeps it well under the limit in practice.
Step 4 — Examples:
[["A","B","C","E"],["S","F","C","S"],["A","D","E","E"]], "ABCCED"→ True.- Same grid,
"SEE"→ True. - Same grid,
"ABCB"→ False (can’t reuse the B).
Step 5 — Brute Force: Try every starting cell as the first letter; DFS in all 4 directions, tracking visited.
Step 6 — Complexity: O(R · C · 4^L) time. O(L) space if in-place. O(R · C + L) with separate visited set.
Step 7 — Pattern:
- “Find a path in a grid with character constraints” → grid backtracking.
- “In-place visited” is the standard memory optimization.
- “Many words at once” → Trie + LC 212.
Step 8 — Develop: see solution.py. In-place AND separate-set variants. Bonus: Trie-based LC 212 sketch.
Step 9 — Test: word starts at every corner, word loops back into used cell, word longer than grid, single-cell grid.
5. Progressive Hints
If stuck for 5 min, open hints.md.
6. Deeper Insight — Grid backtracking patterns
6.1 The DFS template for grid backtracking
DIRS = ((-1, 0), (1, 0), (0, -1), (0, 1))
def dfs(r, c, i):
# Base: matched the entire word
if i == len(word): return True
# Bounds + character match (BOTH checks here — single source of truth)
if r < 0 or r >= R or c < 0 or c >= C or board[r][c] != word[i]:
return False
# Mark visited (in-place sentinel)
saved = board[r][c]
board[r][c] = '#'
# Recurse on all 4 neighbors
for dr, dc in DIRS:
if dfs(r + dr, c + dc, i + 1):
board[r][c] = saved # restore before short-circuit return
return True
board[r][c] = saved # restore on backtrack
return False
The single-source-of-truth check (bounds + character + visitedness all in one if) is the bug-resistant version. Many candidates check bounds in the caller’s loop, character in the function, and visitedness separately — three places to be wrong.
6.2 In-place marking vs separate visited set — why it matters
- Separate set: O(R·C) memory per recursion stack frame’s visited copy (if you naively copy), OR a shared set + remove on backtrack (cleaner). Either way, hashing adds constant overhead.
- In-place: O(L) recursion stack only. No extra memory. Hot in practice.
Caveat: the function must NOT throw between mark and restore (it leaves the grid corrupted). Wrap restore in try/finally if the caller’s grid is precious.
6.3 Pruning order
The early-return on bounds + character mismatch is the entire ballgame. Without it, you explore 4^L paths for every starting cell — TLE.
Some implementations also count character frequencies in the grid vs the word before any DFS — if the grid lacks even one letter of the word, return False immediately. Saves a full search on impossible inputs. Cheap O(R·C + L) preprocessing.
Even better: if word[0] appears MORE often in the grid than word[-1], reverse the word before searching. Counter-intuitive but it works because we prune on first mismatch — starting from the rarer character gives fewer wasted DFS calls.
6.4 LC 212 — Word Search II (the Trie extension)
For K words, the naive approach runs LC 79 K times: O(K · R·C · 4^L). For K=10000 this is unacceptable.
The Trie approach builds a Trie of all K words, then does ONE DFS per starting cell, following Trie edges. When the DFS reaches a Trie node marked “end-of-word,” record the word and delete the leaf (so we don’t report it twice and to prune dead branches).
Complexity: O(R · C · 4^L) where L = max word length. The constant is small because the Trie heavily prunes (4 directions × Trie branching, but most paths die quickly).
Trie pruning is a staff+ signal for this problem family.
6.5 Why we DON’T memoize
This problem looks DP-shaped (“subproblem: can we match word[i:] starting at (r, c)?”) but it’s NOT memoizable: the answer depends not only on (r, c, i) but ALSO on which cells have been visited. The visited state is exponential. So backtracking is the right tool; DP is the wrong one.
This is a classic interview “trap” — the candidate who tries to memoize wastes 10 minutes.
7. Anti-Pattern Analysis
- Separate
visitedset passed by copy. O(R·C) memory per call × 4^L calls = death. - Mark visited BEFORE the bounds/char check. Corrupts the grid on invalid recursion.
- Restore visited ONLY on the False path (forget to restore on early-True return). Leaves grid corrupted; subsequent starting-cell searches see wrong values.
- Memoize
(r, c, i). Wrong because visitedness matters. Fundamentally unsound. - Iterate starting cells but skip the early character match.
if board[r][c] == word[0]: dfs(...). The inner DFS handles it, but the outer skip saves a function call per non-matching cell. - Allocate
DIRSinside the recursion. Tiny waste but a code smell — hoist to module scope. - For LC 212, run LC 79 once per word without Trie. Times out at K > 100.
- Path string concatenation (
path + chper call) — for this problem irrelevant since we don’t track path, but a common bug elsewhere.
8. Skills & Takeaways
- Grid backtracking template: single-source-of-truth check + in-place visited + restore.
- DIRS = ((-1,0),(1,0),(0,-1),(0,1)) — module-level constant.
- Pruning is everything: bounds + char-match is the single line that prevents TLE.
- In-place sentinel (
'#') trick — saves O(R·C) memory. - Trie extension for LC 212 — multi-word search.
- DON’T memoize — visited state is part of the subproblem.
9. When to Move On
Move on when:
- You can write the LC 79 template in under 10 min.
- You can sketch the Trie extension (LC 212) in under 15 min.
- You can articulate WHY memoization is unsound here.
- Stress test on 100 random grids agrees with a brute oracle.
10. Company Context
Meta:
- LC 79 is one of the most-asked Medium problems at Meta. Bar Raiser expectation: clean code, in-place visited, articulate complexity.
- LC 212 (Trie variant) is asked for L5+ — a “warm” candidate is expected to mention Trie immediately when “many words” comes up.
- Scorecard phrase: “Algorithm — clean grid backtracking, in-place visited, articulated 4^L complexity.”
Amazon:
- L5 staple. The “in-place visited” is often the deciding code-quality signal.
- Bar Raisers test the “now do it for 10000 words” follow-up.
Microsoft:
- L63. Often pairs LC 79 with “now path-track and return the path” (Word Search variant returning the cell sequence).
Google:
- L4/L5. Often combined with “now visualize the search frontier” to test BFS vs DFS reasoning.
Apple:
- Common in algo loops; education and accessibility teams use grid-pathing in features.
11. Interviewer’s Lens
| Signal | Junior | Mid | Senior | Staff/Principal |
|---|---|---|---|---|
| Visited | Set (copy per call) | Shared set + add/remove | In-place sentinel | + explains memory trade-off |
| Pruning | At leaf | At each call | Early bounds + char check in one if | + character-frequency precheck + reverse-word heuristic |
| LC 212 | “Run LC 79 K times” | “Trie?” (vague) | Trie + DFS along Trie nodes | + leaf-deletion + dead-branch pruning |
| Memoize? | Tries to memoize | “Maybe?” | “No — visited state is exponential” | + cites: depends on the search state, not just position |
12. Level Delta
Mid (L4):
- Works with separate visited set; complexity is fuzzy.
- Handles 1×1 grid, word = “A”.
Senior (L5):
- In-place visited; restores correctly on all paths.
- Cites O(R·C · 4^L).
- Mentions Trie for the multi-word variant.
Staff (L6):
- LC 212 with Trie + leaf-deletion.
- Articulates why memoization is unsound.
- Character-frequency precheck + reverse-word pruning.
- Returns path if asked, not just boolean.
Principal (L7+):
- Discusses production: text-on-image (OCR post-processing — search a known dictionary in extracted glyph grid), game-board solvers (Boggle, Scrabble), bioinformatics (sequence alignment in 2D scaffolds).
- Discusses parallelization: shard starting cells across workers; each worker has its own grid copy.
- Discusses approximate match: edit-distance Trie traversal for fuzzy word search (production spellcheck).
13. Follow-up Q&A
Q1: Return the actual path (cells), not just bool.
A: Maintain a path list of (r, c). Append/pop in lockstep with mark/restore. On success, deep-copy and return.
Q2: LC 212 — find ALL words from a dictionary present in the grid. A: Build a Trie of all words. DFS each starting cell, walking the Trie in parallel. On end-of-word, record + delete leaf to avoid dupes and dead branches.
Q3: What if cells can be reused?
A: Then memoization works: subproblem is (r, c, i) independent of history. O(R · C · L) time, O(R · C · L) memo.
Q4: 8-directional (include diagonals)?
A: DIRS = [(dr, dc) for dr in (-1,0,1) for dc in (-1,0,1) if (dr,dc) != (0,0)]. 8^L worst case — same algorithm, larger constant.
Q5: How to scale to a 1000×1000 grid? A: Same template. Worst-case complexity unchanged, but practical pruning (character frequency, starting from rarer characters) dominates. For DICTIONARY search (LC 212), Trie is essential.
14. Solution Walkthrough
See solution.py:
exist(board, word)— in-place visited, single-source-of-truth check.exist_with_set(board, word)— separatevisitedset (less efficient but doesn’t mutate input).exist_brute(board, word)— naive exhaustive search without pruning (for stress comparison; very slow).edge_cases()+stress_test()— random grids 3×3 to 4×4 with random words.
Run: python3 solution.py.
15. Beyond the Problem
Principal-engineer code review comment:
“Three things. (1) The in-place mutation is the key optimization but ALSO the dangerous one. If this function is called from multi-threaded code, you’ve created a race condition — make the contract crystal clear in the docstring (
Mutates board temporarily; not thread-safe) or accept a copy. (2) For LC 212, the leaf-deletion is non-obvious — write the test that exercises the case where the same prefix has 5 words ending under it; without leaf-deletion you’ll report duplicates AND fail to prune. (3) Consider adding amax_length_in_dictTrie-walk depth limit when the word lengths vary wildly — saves the DFS from continuing past the longest possible match at each starting cell.”
Connections forward:
- p36, p37, p38 — backtracking siblings.
- p40 — N-Queens: stateful backtracking with sets.
- LC 212 — Word Search II: Trie extension.
- LC 200 — Number of Islands: grid DFS without backtracking.
- LC 130 — Surrounded Regions: grid DFS with boundary trick.
- LC 1219 — Path with Maximum Gold: grid backtracking + value accumulation.
- Phase 03 Lab 07 — Trie Applications.
- Phase 04 Lab 02 — DFS / Connected Components.
- Production: Boggle/Scrabble solvers, OCR dictionary post-processing, game-board path validation, formal verification of small state-spaces.