p55 — Trapping Rain Water

Source: LeetCode 42 · Hard · Topics: Array, Two Pointers, Stack, Monotonic Stack, Dynamic Programming Companies (2024–2025 frequency): Google (very high), Meta (very high), Amazon (very high), Microsoft (high), Apple (medium-high), Bloomberg (high), HFT firms (very high) Loop position: The canonical Hard interview problem. If you can solve it THREE ways and articulate when each is preferable, you’ve internalized the array-pattern toolbox.

1. Quick Context

Given an elevation map height[] (non-negative ints, bar widths 1), compute how many units of water it traps after raining.

For each column i, the water above it = min(max_left[i], max_right[i]) - height[i] (if positive, else 0), where max_left[i] / max_right[i] are the tallest bars to the left/right (inclusive of i).

Three optimal solutions (all O(n)):

Solution A — Prefix max arrays

max_left  = running max of height left-to-right
max_right = running max right-to-left
total = sum(max(0, min(max_left[i], max_right[i]) - height[i]) for i in range(n))

O(n) time, O(n) space. Most intuitive.

Solution B — Two pointers

Maintain l, r, lmax, rmax. Move whichever pointer points to the shorter wall (it’s the bottleneck on that side); water at that index = wall_max - height[index].

l, r = 0, n - 1
lmax = rmax = 0
total = 0
while l < r:
    if height[l] < height[r]:
        lmax = max(lmax, height[l])
        total += lmax - height[l]
        l += 1
    else:
        rmax = max(rmax, height[r])
        total += rmax - height[r]
        r -= 1

O(n) time, O(1) space. Optimal in space.

Solution C — Monotonic stack

Maintain a stack of indices with STRICTLY DECREASING heights. When a taller bar arrives, pop the “bottom” and compute the rectangular water slice between left-of-popped (on stack) and current i.

stack = []
total = 0
for i, h in enumerate(height):
    while stack and height[stack[-1]] < h:
        bottom = stack.pop()
        if not stack:
            break
        left = stack[-1]
        width = i - left - 1
        bounded = min(h, height[left]) - height[bottom]
        total += width * bounded
    stack.append(i)

O(n) time, O(n) space. Computes water in horizontal slices instead of vertical columns.


🔗 https://leetcode.com/problems/trapping-rain-water/

STOP. 25-min timer. Aim for ALL THREE solutions.


3. Prerequisite Concepts

  • Prefix-max / suffix-max arrays (p31-p33).
  • Two-pointer convergence (p26).
  • Monotonic stack (p51-p54).
  • The “vertical column” vs “horizontal slice” reframing of area.

4. How to Approach (FRAMEWORK Steps 1–9)

Step 1 — Restate: For each column, water = min(tallest left wall, tallest right wall) - height[i] (clip at 0). Sum.

Step 2 — Clarify:

  • “Heights non-negative ints?” (Yes.)
  • “Length range?” (n ≤ 2×10^4 in LC.)
  • “Maximum height?” (10^5 in LC; matters for overflow in C++/Java, not Python.)
  • “Empty array → 0?” (Yes.)

Step 3 — Constraints: n ≤ 2×10^4. O(n²) might pass but is sloppy; aim O(n).

Step 4 — Examples:

  • [0,1,0,2,1,0,1,3,2,1,2,1] → 6 (LC canonical).
  • [4,2,0,3,2,5] → 9.
  • [3,0,2,0,4] → 7.
  • [1] → 0 (single column traps no water).
  • [3,3,3] → 0 (flat top).
  • [] → 0.
  • [5,4,3,2,1] → 0 (monotone; no right wall).
  • [1,2,3,4,5] → 0 (monotone; no left wall).

Step 5 — Brute force: For each i, scan left for max_left, right for max_right, sum. O(n²).

Step 6 — Complexity target: O(n) time. Bonus: O(1) space.

Step 7 — Pattern: Three techniques, all O(n):

  • Prefix max arrays (most intuitive).
  • Two pointers (best space).
  • Monotonic stack (horizontal-slice mental model).

Step 8 — Develop: see solution.py.

Step 9 — Test: monotone up; monotone down; flat; valleys; multiple basins; LC examples.


5. Progressive Hints

If stuck for 5 min, open hints.md.


6. Deeper Insight — Three lenses on the same answer

6.1 Vertical-column lens (Solutions A and B)

For each column i, water height = min(max_left[i], max_right[i]) - height[i]. Sum.

Solution A (prefix-max arrays) is the literal implementation: precompute both arrays in O(n) each, then sum in O(n).

Solution B (two pointers) is the SAME idea with a clever observation: as we converge l and r, the smaller of height[l] vs height[r] is provably the bottleneck on its side. Why? Because the other side has at least one wall (the larger value) that’s at least as tall as the current l_max or r_max. So whichever side has the shorter current pointer dictates the water level at that pointer’s column.

This lets us avoid storing the prefix arrays: we maintain lmax and rmax rolling.

6.2 Horizontal-slice lens (Solution C, monotonic stack)

Instead of “how much water sits above column i,” think “what horizontal slices of water fill in between bars?”

A horizontal slice forms whenever a tall bar appears next to a previously-shorter sequence ending at another tall bar. The monotonic decreasing stack tracks the “left walls”; when a taller bar h arrives, every shorter bar previously pushed gets a slice computed:

  • The popped bar = the BOTTOM of the slice.
  • The bar below it on the stack = the LEFT wall.
  • Current i = the RIGHT wall.
  • Slice height = min(h, height[left]) - height[bottom].
  • Slice width = i - left - 1.

Sum all slices = total water.

6.3 Why all three work / proof sketch

The two-pointer correctness rests on this lemma:

Lemma. If height[l] ≤ height[r], then min(max_left[l], max_right[l]) = max_left[l].

Proof. Because height[r] ≥ height[l], and r > l, we have max_right[l] ≥ height[r] ≥ height[l] ≥ max_left[l] once we account for lmax being correct. So max_right[l] ≥ max_left[l], so min(max_left[l], max_right[l]) = max_left[l].

The stack version is a different decomposition of the same sum — each unit of water gets counted exactly once.

The prefix-array version is the literal definition.

Three different perspectives, same answer.

6.4 Why are these all O(n)?

  • Prefix arrays: two passes of n, then one sum. Trivially O(n).
  • Two pointers: each iteration moves exactly one pointer; pointers can move at most n total. O(n).
  • Stack: each index pushed and popped at most once. Amortized O(n).

6.5 Why is the stack version “harder to get right”?

Because the “bounded height” computation min(h, height[left]) - height[bottom] requires careful attention:

  • h = height[i] = current bar (right wall).
  • height[left] = bar below bottom on the stack (left wall).
  • height[bottom] = the popped bar (the floor of the slice).
  • Bounded = the shorter wall - the floor.

Forgetting to break when the stack becomes empty (no left wall) is the #1 bug.

6.6 Memory-bandwidth comparison (real-world)

  • Prefix arrays: 2 extra arrays, 3 passes over data → 5n reads / 3n writes.
  • Two pointers: 0 extra arrays, 1 pass with both ends → 2n reads.
  • Stack: 1 stack (up to n), 1 pass → roughly 3n reads + 2n stack ops.

Two-pointer wins on cache locality and memory bandwidth. Pick two-pointer in a hot loop.

6.7 What if heights can be floating point or negative?

  • Floats: algorithms work unchanged; watch for precision when summing many tiny contributions.
  • Negative heights: problem definition assumes ≥ 0 (water sits on x-axis). Negatives would mean “depressions below ground level” — different problem.

6.8 LC 407 — Trapping Rain Water II (2D)

Generalization to a 2D heightmap. Cannot use two pointers directly. Standard solution: BFS/Dijkstra from the boundary inward using a min-heap of “wall heights at the frontier.” O(mn log(mn)).

Mention as the natural generalization in senior+ discussions.

6.9 LC 11 — Container With Most Water

Different problem (only two walls, no floor of the container counts). Same two-pointer principle though.


7. Anti-Pattern Analysis

  1. For each i, scan left and right for max each time — O(n²). On n = 2×10^4 it’s 4×10^8 — borderline TLE in tight Python; reliably TLE if the timer is strict.
  2. Initialize prefix max with 0 — fine for non-negative heights; fails subtly for empty prefix on the left of index 0 (use height[0] as the starting max_left[0]).
  3. In the two-pointer version, use <= instead of < for the bottleneck check — works on most inputs but may double-count one column on equal-walls cases. Convention matters; pick one and stick.
  4. In the stack version, forget the empty-stack check after pop — IndexError when the popped bar has no left wall.
  5. In the stack version, compute width = i - bottom instead of width = i - left - 1 — counts the popped bottom’s column itself, wrong.
  6. In the stack version, use <= instead of < in the pop condition — produces zero-width or negative-width slices for tied walls; may give correct sum by accident or may double-count.
  7. Use a DP approach dp[i] = water at i + dp[i-1] — there’s no clean DP recurrence; you end up reinventing the prefix-array approach with extra steps.
  8. Try to use a segment tree for range-max — works O(n log n) but adds complexity for no asymptotic gain. Don’t.
  9. Compute “wasted” water above the highest peak — there is none; water can only sit between walls.
  10. Forget the empty-input edge case — index errors.

8. Skills & Takeaways

  • Three lenses on the same problem: prefix-max columns, two-pointer columns, mono-stack horizontal slices.
  • Two-pointer lemma for “shorter wall is the bottleneck” — generalizes to other “min of two side bounds” problems.
  • Horizontal vs vertical decomposition of area — recurring pattern in geometry / accounting.
  • Memory bandwidth matters in production; two-pointer wins.
  • LC 42 → LC 11 → LC 407 progression.

9. When to Move On

  • You can write ALL THREE solutions cold in under 25 min total.
  • You can explain the two-pointer lemma in 30 seconds.
  • You can compute a worked example with the stack version step by step.
  • You can sketch LC 407 (2D) at a high level.
  • Stress test on 500 random arrays (size ≤ 20, heights 0..10) shows all three agree with brute O(n²).

10. Company Context

Google:

  • L5/L6/L7 onsite. Easily the most-asked Hard. The bar for L6+ is all three solutions in interview time with trade-off discussion.
  • Bar Raiser scorecard phrase: “Three solutions, articulated two-pointer lemma, mentioned LC 407 generalization, compared memory bandwidth.”

Meta:

  • E5/E6 onsite. Often paired with a second Hard. Must produce at least two solutions for E5.

Amazon:

  • L6+. The Bar Raiser favorite. Productized as “warehouse rainwater drainage capacity.”

Microsoft:

  • L65+. Often the first onsite question to gauge depth.

Apple:

  • Senior+ infrastructure roles. Less frequent but appears.

Bloomberg:

  • HFT/trading. Productized as “max drawdown of a stock signal.” Same math, different name.

HFT firms:

  • Very high. Will demand you discuss numerical precision (floats), branch prediction (two-pointer’s branch), and SIMD-friendliness (prefix arrays vectorize well).

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff/Principal
Solutions producedOne (brute or DP)One optimal (usually prefix arrays)TWO (prefix + two-pointer)THREE + comparison
Two-pointer lemma“It just works”VagueFormal statement+ proof + generalization
Stack versionCan’t deriveMemorizedDerived from scratch+ horizontal-slice mental model articulated
LC 407Doesn’t know“I’ve heard of it”“Heap-based BFS from boundary”+ full sketch + complexity
Memory bandwidthDoesn’t mentionVaguely “two-pointer is better”Compares passes+ cache lines / SIMD friendliness

12. Level Delta

Mid (L4):

  • ONE correct solution (typically prefix arrays).
  • Correct on LC examples and edge cases.
  • O(n) time articulated.

Senior (L5):

  • TWO solutions (prefix arrays + two-pointer).
  • Two-pointer lemma stated clearly.
  • Mentions stack version exists.
  • Connects to LC 11.

Staff (L6):

  • THREE solutions with derivation.
  • Compares memory bandwidth / cache / space.
  • Sketches LC 407 (2D) with heap-based BFS from boundary.
  • Discusses streaming variant trade-offs.

Principal (L7+):

  • Discusses streaming: heights arrive one at a time; can you maintain trapped water incrementally? (Hard — adding a new bar can retroactively change water in existing basins. Requires careful invariant maintenance.)
  • Discusses distributed: heightmap partitioned across machines; how do you compute trapped water with bounded communication?
  • Discusses persistent version: snapshot heights at time t, query trapped water as of t.
  • Discusses 2D generalization (LC 407) in depth: BFS frontier with min-heap = Dijkstra-like; complexity O(mn log(mn)); why simpler approaches fail (water can flow diagonally between basins through saddle points).
  • Discusses numerical precision if heights are floats with large dynamic range (summation order matters; Kahan summation for stability).

13. Follow-up Q&A

Q1: LC 407 — Trapping Rain Water II (2D heightmap). Sketch. A1: Push all boundary cells into a min-heap (the “frontier”). Repeatedly pop the lowest-height boundary cell; for each unvisited neighbor, the water level there = max(neighbor_height, popped_height) (the wall containing it); add water = max(0, popped_height - neighbor_height); push the neighbor with its effective height. O(mn log(mn)).

Q2: What if heights can change (e.g., bar i becomes taller)? A2: Naive: recompute O(n) per update. Smarter: depends on what queries you support. If only “what’s total water?” after each update, you can maintain it with segment trees over max_left/max_right but it’s involved. In an interview, sketch the naive and discuss complexity.

Q3: Can you compute water trapped in O(1) extra space AND O(n) time? A3: Yes — two-pointer (Solution B).

Q4: What’s the worst-case stack size in Solution C? A4: n, achieved by a strictly decreasing input (nothing ever gets popped).

Q5: How does LC 42 relate to LC 11 (Container With Most Water)? A5: LC 11 = “find two walls that form the largest container; no floor matters.” Two-pointer principle: at each step, move the SHORTER wall inward (the longer can’t help; the shorter is the bottleneck). Same lemma in spirit. LC 42 = “every column traps water above it”; sum across all columns.

Q6: Streaming heights — can we report cumulative water after each new bar? A6: Adding a new tall bar can retroactively trap water in valleys that were “open” before. So no fully online O(1)-per-update algorithm exists in general. You can maintain the stack (Solution C) and emit slices as they’re resolved, but uncovered water requires the full history.

Q7: What if we want to RAIN k units of water and ask “where does it pool”? A7: Different problem. Use a priority queue / event-based simulation. Out of scope for LC 42.

Q8: Numerical stability — when bars are floating point with huge range? A8: Sum contributions in increasing-magnitude order (sorted) to minimize catastrophic cancellation. Or use Kahan summation. For typical interview ints, no concern.


14. Solution Walkthrough

See solution.py. Four implementations:

  • trap_prefix_arrays — Solution A: max_left + max_right precomputed. O(n) time, O(n) space.
  • trap_two_pointers — Solution B: converging pointers. O(n) time, O(1) space.
  • trap_stack — Solution C: monotonic decreasing stack, horizontal slices. O(n) time, O(n) space.
  • trap_brute — for each i, scan left/right for max. O(n²) oracle.

Stress test on 500 random arrays (size ≤ 20, heights 0..10): all four agree.


15. Beyond the Problem

Principal-engineer code review comment:

“Choose the right tool for the right reason. In a Python interview, the two-pointer version is the elegant winner — O(1) space, smallest code, hardest for the interviewer to find bugs in. In a C++ batch job over GB-scale heightmaps, the prefix-array version vectorizes (SIMD), wins on throughput, and parallelizes trivially. In an online system where bars arrive incrementally, the stack version is the only one that lets you emit partial answers without a full re-scan. Three solutions, three deployment fits — none is universally ‘the best.’ This is the difference between knowing an algorithm and understanding it.”