p73 — Interleaving String
Source: LeetCode 97 · Hard (often listed Medium-Hard) · Topics: DP, String Companies (2024–2025): Amazon (high), Microsoft (medium-high), Google (medium), Meta (medium) Loop position: The DP problem that exposes “why greedy fails.” Surface intuition says “always take from whichever string matches” — that’s wrong. Tests whether candidate can construct a counterexample and articulate why a 2D DP table is required.
1. Quick Context
Given strings s1, s2, s3. Return True iff s3 is an interleaving of s1 and s2 (preserving order within each).
Strategy: 2D DP. dp[i][j] = True iff s3[:i+j] is an interleaving of s1[:i] and s2[:j]. Recurrence:
dp[i][j] = (dp[i-1][j] AND s1[i-1]==s3[i+j-1]) OR (dp[i][j-1] AND s2[j-1]==s3[i+j-1]).
O(m·n) time and space; can reduce to O(min(m,n)) space.
2. LeetCode Link + Attempt Gate
🔗 https://leetcode.com/problems/interleaving-string/
STOP. 30-min timer. Must reject the greedy approach explicitly.
3. Prerequisite Concepts
- 2D DP over two strings.
- Why greedy (“take from whichever matches the current s3 char”) fails.
- Length precondition:
len(s3) == len(s1) + len(s2), else immediate False.
4. How to Approach (FRAMEWORK 1–9)
1. Restate: Can s3 be formed by interleaving s1 and s2 while preserving each string’s order?
2. Clarify: Order PRESERVED within each source. Each char in s3 must come from s1 OR s2, not both.
3. Examples: s1="aabcc", s2="dbbca", s3="aadbbcbcac" → True. s1="aabcc", s2="dbbca", s3="aadbbbaccc" → False.
5. Brute force: Recursion on (i, j) = pointers into s1 and s2 (s3 pointer = i+j). Exponential without memoization.
6. Target: O(m·n) DP.
7. Pattern: 2D string DP.
8. Develop: see solution.py.
9. Test: empty s3 with empty s1 and s2; greedy-counterexample case (s1="a", s2="aa", s3="aaa" style); length mismatch.
5. Progressive Hints
See hints.md.
6. Deeper Insight
6.1 Why greedy fails — concrete counterexample
s1 = "aabc", s2 = "abad", s3 = "aabadabc". Greedy at s3[2]=‘b’: could come from s1 (s1[2]=‘b’) or s2 (s2[1]=‘b’). Wrong choice leads to dead-end.
This is the classic “local choice doesn’t determine global outcome” pattern. When both pointers can match, both branches must be explored. Memoization makes this efficient.
6.2 The dp[i][j] definition
dp[i][j] = True iff s3[:i+j] is an interleaving of s1[:i] and s2[:j]. So dp[0][0] = True. Total answer is dp[m][n].
The recurrence’s two terms capture “last char came from s1” vs “last char came from s2”:
- From s1:
dp[i-1][j]ANDs1[i-1] == s3[i+j-1]. - From s2:
dp[i][j-1]ANDs2[j-1] == s3[i+j-1].
6.3 First row and column initialization
dp[0][j] = dp[0][j-1] AND s2[j-1]==s3[j-1] — purely from s2.
dp[i][0] = dp[i-1][0] AND s1[i-1]==s3[i-1] — purely from s1.
6.4 Space optimization
Each row depends only on the previous row + current row’s left neighbor. Reduce to a 1D array of length min(m,n)+1. Standard rolling-array trick.
6.5 BFS framing
You can frame this as graph search: nodes are (i, j) states, edges are “advance s1 pointer” or “advance s2 pointer” when matching. Goal: reach (m, n). BFS or DFS over this implicit graph; same O(m·n).
6.6 Why this isn’t a trie / suffix-tree problem
The “preserving order within each” constraint means we can’t just ask “is s3’s multiset = s1’s + s2’s” — order matters. Trie/suffix structures don’t help directly; the DP is canonical.
6.7 LC 1216 Valid Palindrome III, LC 583 Delete Operation — same family
These are all “2D alignment DP” problems. Recognizing the family lets you write the recurrence in 30 seconds once the state is identified.
7. Anti-Pattern Analysis
- Greedy (“take whichever string matches first”) — fails on the standard counterexample.
- Forget length precondition — without
len(s3) == len(s1)+len(s2), recurrence may “succeed” on impossible inputs (rare but real edge case at boundaries). - 3D state
(i, j, k)— wasteful; k = i+j is derived, not free. - Recursion without memoization — exponential.
- Use
dp[i][j]but iterate in wrong order — must ensure row-major or col-major order with proper neighbors. - Compare full prefixes — quadratic per comparison; turns O(mn) into O(mn·max(m,n)).
8. Skills & Takeaways
- 2D string alignment DP.
- “Why greedy fails” recognition.
- Constructing counterexamples for surface-correct heuristics.
- Space optimization via rolling array.
9. When to Move On
- Solve LC 97 cold in 25 min with correct O(mn) DP.
- Articulate the greedy counterexample without prompting.
- Reduce to O(min(m,n)) space.
- Recognize the family (LC 583, 1143, 72, 115, 1216).
10. Company Context
- Amazon: L5/L6; “DP and reject greedy” probe.
- Microsoft: L65; less common but appears.
- Google: L5; rare; testing thinking style more than algorithm.
- Meta: E5; tests careful state definition.
Scorecard for strong: “Rejected greedy with concrete counterexample, defined dp[i][j] cleanly, recurrence in 30 seconds, optimized to 1D rolling.”
11. Interviewer’s Lens
| Signal | Junior | Mid | Senior | Staff+ |
|---|---|---|---|---|
| Approach | Greedy | DP after greedy fails | DP cold, articulates greedy counterexample | + space optimization unprompted |
| State | 3D (i,j,k) | 2D (i,j) | 2D, explains k=i+j is derived | + rolling array + BFS framing |
| Recurrence | Off-by-one | Correct after debugging | Cold | + cites LC 583/1143 family |
| Generalization | None | None | “Other alignment problems” | + LP relaxation / sequence-alignment in bioinformatics (Needleman-Wunsch) |
12. Level Delta
- Mid (L4): DP after attempting greedy; correct after one iteration.
- Senior (L5): Cold DP + counterexample for greedy + clean state + O(mn) confirmed.
- Staff (L6): + 1D rolling + cites LC 583, 1143, 72, 115 as same family + bio sequence-alignment framing.
- Principal (L7+): + Needleman-Wunsch / Smith-Waterman applications in bioinformatics + parallelism via anti-diagonal traversal + GPU implementations.
13. Follow-up Q&A
Q1: Return the interleaving (not just yes/no).
A1: Backtrack from dp[m][n] along the True path; reconstruct char by char.
Q2: What if a char appears at both s1[i-1] and s2[j-1] (both match s3)?
A2: Both branches valid; explore both. DP handles this naturally (OR in recurrence).
Q3: What if interleaving allows repetition (each char of s1/s2 can be used multiple times)? A3: Different problem — closer to LCS variants or unbounded-knapsack-of-chars.
Q4: How would you parallelize? A4: DP table can be filled anti-diagonal by anti-diagonal — each anti-diagonal independent, parallelize across threads/GPU lanes.
Q5: Bioinformatics connection? A5: Needleman-Wunsch (global sequence alignment) and Smith-Waterman (local alignment) are 2D DP with the same shape and similar recurrences. Production tools (BLAST) use heuristic versions.
14. Solution Walkthrough
See solution.py:
is_interleave_dp— iterative 2D DP.is_interleave_1d— space-optimized rolling array.is_interleave_memo— top-down with@lru_cache.is_interleave_brute— naive recursion (oracle).
15. Beyond the Problem
Principal code review: “Interleaving String is a 2D-alignment DP whose value lies in teaching that greedy choices CAN be globally suboptimal even when each local choice ‘looks right.’ The structure —
dp[i][j]indexed by progress in each input — is exactly the structure of Needleman-Wunsch global sequence alignment in bioinformatics, of edit distance, of LCS. These O(m·n) DPs power genome alignment tools that run on petabytes of sequencing data. Modern implementations exploit anti-diagonal parallelism (each anti-diagonal is independent), giving GPU speedups of 10–100×. When you writedp[i][j] = dp[i-1][j] OR dp[i][j-1]with the appropriate guards, you’re writing the kernel of an entire field of computational biology. Tell the interviewer this; they’ll remember.”