p73 — Interleaving String

Source: LeetCode 97 · Hard (often listed Medium-Hard) · Topics: DP, String Companies (2024–2025): Amazon (high), Microsoft (medium-high), Google (medium), Meta (medium) Loop position: The DP problem that exposes “why greedy fails.” Surface intuition says “always take from whichever string matches” — that’s wrong. Tests whether candidate can construct a counterexample and articulate why a 2D DP table is required.

1. Quick Context

Given strings s1, s2, s3. Return True iff s3 is an interleaving of s1 and s2 (preserving order within each).

Strategy: 2D DP. dp[i][j] = True iff s3[:i+j] is an interleaving of s1[:i] and s2[:j]. Recurrence:

dp[i][j] = (dp[i-1][j] AND s1[i-1]==s3[i+j-1]) OR (dp[i][j-1] AND s2[j-1]==s3[i+j-1]).

O(m·n) time and space; can reduce to O(min(m,n)) space.

2. LeetCode Link + Attempt Gate

🔗 https://leetcode.com/problems/interleaving-string/

STOP. 30-min timer. Must reject the greedy approach explicitly.

3. Prerequisite Concepts

2D DP over two strings.
Why greedy (“take from whichever matches the current s3 char”) fails.
Length precondition: len(s3) == len(s1) + len(s2), else immediate False.

4. How to Approach (FRAMEWORK 1–9)

1. Restate: Can s3 be formed by interleaving s1 and s2 while preserving each string’s order?

2. Clarify: Order PRESERVED within each source. Each char in s3 must come from s1 OR s2, not both.

3. Examples: s1="aabcc", s2="dbbca", s3="aadbbcbcac" → True. s1="aabcc", s2="dbbca", s3="aadbbbaccc" → False.

5. Brute force: Recursion on (i, j) = pointers into s1 and s2 (s3 pointer = i+j). Exponential without memoization.

6. Target: O(m·n) DP.

7. Pattern: 2D string DP.

8. Develop: see solution.py.

9. Test: empty s3 with empty s1 and s2; greedy-counterexample case (s1="a", s2="aa", s3="aaa" style); length mismatch.

6. Deeper Insight

6.1 Why greedy fails — concrete counterexample

s1 = "aabc", s2 = "abad", s3 = "aabadabc". Greedy at s3[2]=‘b’: could come from s1 (s1[2]=‘b’) or s2 (s2[1]=‘b’). Wrong choice leads to dead-end.

This is the classic “local choice doesn’t determine global outcome” pattern. When both pointers can match, both branches must be explored. Memoization makes this efficient.

6.2 The `dp[i][j]` definition

dp[i][j] = True iff s3[:i+j] is an interleaving of s1[:i] and s2[:j]. So dp[0][0] = True. Total answer is dp[m][n].

The recurrence’s two terms capture “last char came from s1” vs “last char came from s2”:

From s1: dp[i-1][j] AND s1[i-1] == s3[i+j-1].
From s2: dp[i][j-1] AND s2[j-1] == s3[i+j-1].

6.3 First row and column initialization

dp[0][j] = dp[0][j-1] AND s2[j-1]==s3[j-1] — purely from s2. dp[i][0] = dp[i-1][0] AND s1[i-1]==s3[i-1] — purely from s1.

6.4 Space optimization

Each row depends only on the previous row + current row’s left neighbor. Reduce to a 1D array of length min(m,n)+1. Standard rolling-array trick.

You can frame this as graph search: nodes are (i, j) states, edges are “advance s1 pointer” or “advance s2 pointer” when matching. Goal: reach (m, n). BFS or DFS over this implicit graph; same O(m·n).

6.6 Why this isn’t a trie / suffix-tree problem

The “preserving order within each” constraint means we can’t just ask “is s3’s multiset = s1’s + s2’s” — order matters. Trie/suffix structures don’t help directly; the DP is canonical.

6.7 LC 1216 Valid Palindrome III, LC 583 Delete Operation — same family

These are all “2D alignment DP” problems. Recognizing the family lets you write the recurrence in 30 seconds once the state is identified.

7. Anti-Pattern Analysis

Greedy (“take whichever string matches first”) — fails on the standard counterexample.
Forget length precondition — without len(s3) == len(s1)+len(s2), recurrence may “succeed” on impossible inputs (rare but real edge case at boundaries).
3D state (i, j, k) — wasteful; k = i+j is derived, not free.
Recursion without memoization — exponential.
Use dp[i][j] but iterate in wrong order — must ensure row-major or col-major order with proper neighbors.
Compare full prefixes — quadratic per comparison; turns O(mn) into O(mn·max(m,n)).

8. Skills & Takeaways

2D string alignment DP.
“Why greedy fails” recognition.
Constructing counterexamples for surface-correct heuristics.
Space optimization via rolling array.

9. When to Move On

Solve LC 97 cold in 25 min with correct O(mn) DP.
Articulate the greedy counterexample without prompting.
Reduce to O(min(m,n)) space.
Recognize the family (LC 583, 1143, 72, 115, 1216).

10. Company Context

Amazon: L5/L6; “DP and reject greedy” probe.
Microsoft: L65; less common but appears.
Google: L5; rare; testing thinking style more than algorithm.
Meta: E5; tests careful state definition.

Scorecard for strong: “Rejected greedy with concrete counterexample, defined dp[i][j] cleanly, recurrence in 30 seconds, optimized to 1D rolling.”

11. Interviewer’s Lens

Signal	Junior	Mid	Senior	Staff+
Approach	Greedy	DP after greedy fails	DP cold, articulates greedy counterexample	+ space optimization unprompted
State	3D `(i,j,k)`	2D `(i,j)`	2D, explains k=i+j is derived	+ rolling array + BFS framing
Recurrence	Off-by-one	Correct after debugging	Cold	+ cites LC 583/1143 family
Generalization	None	None	“Other alignment problems”	+ LP relaxation / sequence-alignment in bioinformatics (Needleman-Wunsch)

12. Level Delta

Mid (L4): DP after attempting greedy; correct after one iteration.
Senior (L5): Cold DP + counterexample for greedy + clean state + O(mn) confirmed.
Staff (L6): + 1D rolling + cites LC 583, 1143, 72, 115 as same family + bio sequence-alignment framing.
Principal (L7+): + Needleman-Wunsch / Smith-Waterman applications in bioinformatics + parallelism via anti-diagonal traversal + GPU implementations.

13. Follow-up Q&A

Q1: Return the interleaving (not just yes/no). A1: Backtrack from dp[m][n] along the True path; reconstruct char by char.

Q2: What if a char appears at both s1[i-1] and s2[j-1] (both match s3)? A2: Both branches valid; explore both. DP handles this naturally (OR in recurrence).

Q3: What if interleaving allows repetition (each char of s1/s2 can be used multiple times)? A3: Different problem — closer to LCS variants or unbounded-knapsack-of-chars.

Q4: How would you parallelize? A4: DP table can be filled anti-diagonal by anti-diagonal — each anti-diagonal independent, parallelize across threads/GPU lanes.

Q5: Bioinformatics connection? A5: Needleman-Wunsch (global sequence alignment) and Smith-Waterman (local alignment) are 2D DP with the same shape and similar recurrences. Production tools (BLAST) use heuristic versions.

14. Solution Walkthrough

See solution.py:

is_interleave_dp — iterative 2D DP.
is_interleave_1d — space-optimized rolling array.
is_interleave_memo — top-down with @lru_cache.
is_interleave_brute — naive recursion (oracle).

15. Beyond the Problem

Principal code review: “Interleaving String is a 2D-alignment DP whose value lies in teaching that greedy choices CAN be globally suboptimal even when each local choice ‘looks right.’ The structure — dp[i][j] indexed by progress in each input — is exactly the structure of Needleman-Wunsch global sequence alignment in bioinformatics, of edit distance, of LCS. These O(m·n) DPs power genome alignment tools that run on petabytes of sequencing data. Modern implementations exploit anti-diagonal parallelism (each anti-diagonal is independent), giving GPU speedups of 10–100×. When you write dp[i][j] = dp[i-1][j] OR dp[i][j-1] with the appropriate guards, you’re writing the kernel of an entire field of computational biology. Tell the interviewer this; they’ll remember.”

LeetCode Interview — Extreme Coding