p73 — Interleaving String

Source: LeetCode 97 · Hard (often listed Medium-Hard) · Topics: DP, String Companies (2024–2025): Amazon (high), Microsoft (medium-high), Google (medium), Meta (medium) Loop position: The DP problem that exposes “why greedy fails.” Surface intuition says “always take from whichever string matches” — that’s wrong. Tests whether candidate can construct a counterexample and articulate why a 2D DP table is required.

1. Quick Context

Given strings s1, s2, s3. Return True iff s3 is an interleaving of s1 and s2 (preserving order within each).

Strategy: 2D DP. dp[i][j] = True iff s3[:i+j] is an interleaving of s1[:i] and s2[:j]. Recurrence:

  • dp[i][j] = (dp[i-1][j] AND s1[i-1]==s3[i+j-1]) OR (dp[i][j-1] AND s2[j-1]==s3[i+j-1]).

O(m·n) time and space; can reduce to O(min(m,n)) space.

🔗 https://leetcode.com/problems/interleaving-string/

STOP. 30-min timer. Must reject the greedy approach explicitly.

3. Prerequisite Concepts

  • 2D DP over two strings.
  • Why greedy (“take from whichever matches the current s3 char”) fails.
  • Length precondition: len(s3) == len(s1) + len(s2), else immediate False.

4. How to Approach (FRAMEWORK 1–9)

1. Restate: Can s3 be formed by interleaving s1 and s2 while preserving each string’s order?

2. Clarify: Order PRESERVED within each source. Each char in s3 must come from s1 OR s2, not both.

3. Examples: s1="aabcc", s2="dbbca", s3="aadbbcbcac" → True. s1="aabcc", s2="dbbca", s3="aadbbbaccc" → False.

5. Brute force: Recursion on (i, j) = pointers into s1 and s2 (s3 pointer = i+j). Exponential without memoization.

6. Target: O(m·n) DP.

7. Pattern: 2D string DP.

8. Develop: see solution.py.

9. Test: empty s3 with empty s1 and s2; greedy-counterexample case (s1="a", s2="aa", s3="aaa" style); length mismatch.

5. Progressive Hints

See hints.md.

6. Deeper Insight

6.1 Why greedy fails — concrete counterexample

s1 = "aabc", s2 = "abad", s3 = "aabadabc". Greedy at s3[2]=‘b’: could come from s1 (s1[2]=‘b’) or s2 (s2[1]=‘b’). Wrong choice leads to dead-end.

This is the classic “local choice doesn’t determine global outcome” pattern. When both pointers can match, both branches must be explored. Memoization makes this efficient.

6.2 The dp[i][j] definition

dp[i][j] = True iff s3[:i+j] is an interleaving of s1[:i] and s2[:j]. So dp[0][0] = True. Total answer is dp[m][n].

The recurrence’s two terms capture “last char came from s1” vs “last char came from s2”:

  • From s1: dp[i-1][j] AND s1[i-1] == s3[i+j-1].
  • From s2: dp[i][j-1] AND s2[j-1] == s3[i+j-1].

6.3 First row and column initialization

dp[0][j] = dp[0][j-1] AND s2[j-1]==s3[j-1] — purely from s2. dp[i][0] = dp[i-1][0] AND s1[i-1]==s3[i-1] — purely from s1.

6.4 Space optimization

Each row depends only on the previous row + current row’s left neighbor. Reduce to a 1D array of length min(m,n)+1. Standard rolling-array trick.

6.5 BFS framing

You can frame this as graph search: nodes are (i, j) states, edges are “advance s1 pointer” or “advance s2 pointer” when matching. Goal: reach (m, n). BFS or DFS over this implicit graph; same O(m·n).

6.6 Why this isn’t a trie / suffix-tree problem

The “preserving order within each” constraint means we can’t just ask “is s3’s multiset = s1’s + s2’s” — order matters. Trie/suffix structures don’t help directly; the DP is canonical.

6.7 LC 1216 Valid Palindrome III, LC 583 Delete Operation — same family

These are all “2D alignment DP” problems. Recognizing the family lets you write the recurrence in 30 seconds once the state is identified.

7. Anti-Pattern Analysis

  1. Greedy (“take whichever string matches first”) — fails on the standard counterexample.
  2. Forget length precondition — without len(s3) == len(s1)+len(s2), recurrence may “succeed” on impossible inputs (rare but real edge case at boundaries).
  3. 3D state (i, j, k) — wasteful; k = i+j is derived, not free.
  4. Recursion without memoization — exponential.
  5. Use dp[i][j] but iterate in wrong order — must ensure row-major or col-major order with proper neighbors.
  6. Compare full prefixes — quadratic per comparison; turns O(mn) into O(mn·max(m,n)).

8. Skills & Takeaways

  • 2D string alignment DP.
  • “Why greedy fails” recognition.
  • Constructing counterexamples for surface-correct heuristics.
  • Space optimization via rolling array.

9. When to Move On

  • Solve LC 97 cold in 25 min with correct O(mn) DP.
  • Articulate the greedy counterexample without prompting.
  • Reduce to O(min(m,n)) space.
  • Recognize the family (LC 583, 1143, 72, 115, 1216).

10. Company Context

  • Amazon: L5/L6; “DP and reject greedy” probe.
  • Microsoft: L65; less common but appears.
  • Google: L5; rare; testing thinking style more than algorithm.
  • Meta: E5; tests careful state definition.

Scorecard for strong: “Rejected greedy with concrete counterexample, defined dp[i][j] cleanly, recurrence in 30 seconds, optimized to 1D rolling.”

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff+
ApproachGreedyDP after greedy failsDP cold, articulates greedy counterexample+ space optimization unprompted
State3D (i,j,k)2D (i,j)2D, explains k=i+j is derived+ rolling array + BFS framing
RecurrenceOff-by-oneCorrect after debuggingCold+ cites LC 583/1143 family
GeneralizationNoneNone“Other alignment problems”+ LP relaxation / sequence-alignment in bioinformatics (Needleman-Wunsch)

12. Level Delta

  • Mid (L4): DP after attempting greedy; correct after one iteration.
  • Senior (L5): Cold DP + counterexample for greedy + clean state + O(mn) confirmed.
  • Staff (L6): + 1D rolling + cites LC 583, 1143, 72, 115 as same family + bio sequence-alignment framing.
  • Principal (L7+): + Needleman-Wunsch / Smith-Waterman applications in bioinformatics + parallelism via anti-diagonal traversal + GPU implementations.

13. Follow-up Q&A

Q1: Return the interleaving (not just yes/no). A1: Backtrack from dp[m][n] along the True path; reconstruct char by char.

Q2: What if a char appears at both s1[i-1] and s2[j-1] (both match s3)? A2: Both branches valid; explore both. DP handles this naturally (OR in recurrence).

Q3: What if interleaving allows repetition (each char of s1/s2 can be used multiple times)? A3: Different problem — closer to LCS variants or unbounded-knapsack-of-chars.

Q4: How would you parallelize? A4: DP table can be filled anti-diagonal by anti-diagonal — each anti-diagonal independent, parallelize across threads/GPU lanes.

Q5: Bioinformatics connection? A5: Needleman-Wunsch (global sequence alignment) and Smith-Waterman (local alignment) are 2D DP with the same shape and similar recurrences. Production tools (BLAST) use heuristic versions.

14. Solution Walkthrough

See solution.py:

  • is_interleave_dp — iterative 2D DP.
  • is_interleave_1d — space-optimized rolling array.
  • is_interleave_memo — top-down with @lru_cache.
  • is_interleave_brute — naive recursion (oracle).

15. Beyond the Problem

Principal code review: “Interleaving String is a 2D-alignment DP whose value lies in teaching that greedy choices CAN be globally suboptimal even when each local choice ‘looks right.’ The structure — dp[i][j] indexed by progress in each input — is exactly the structure of Needleman-Wunsch global sequence alignment in bioinformatics, of edit distance, of LCS. These O(m·n) DPs power genome alignment tools that run on petabytes of sequencing data. Modern implementations exploit anti-diagonal parallelism (each anti-diagonal is independent), giving GPU speedups of 10–100×. When you write dp[i][j] = dp[i-1][j] OR dp[i][j-1] with the appropriate guards, you’re writing the kernel of an entire field of computational biology. Tell the interviewer this; they’ll remember.”