p24 — Pacific Atlantic Water Flow

Source: LeetCode 417 · Medium · Topics: BFS, DFS, Matrix, Multi-Source BFS Companies (2024–2025 frequency): Amazon (very high), Google (high), Meta (medium), Microsoft (medium), Bloomberg (medium), Apple (medium) Loop position: mid-onsite. Sorting signal between “knows BFS” candidates and “knows BFS inversions” candidates.

1. Quick Context

An M×N grid of heights. Water flows from a cell to any 4-neighbor with equal or lower height. Pacific touches the top and left edges; Atlantic touches the bottom and right edges. Return all cells from which water can reach both oceans.

The naive solution is “for each cell, BFS forward to check if it reaches both oceans” → O((MN)²) — borderline TLE on large grids.

The right solution is the multi-source BFS inversion trick:

  • BFS from each ocean edge inward, walking only to neighbors that are current height (reverse of the flow rule).
  • The set of cells reachable from the Pacific edges = cells that can drain to Pacific.
  • Same for Atlantic.
  • Answer = intersection.

Total work: O(MN). The inversion is the entire interview signal.

What it looks like it tests: can you do BFS/DFS on a grid. What it actually tests: (a) do you see that “for each cell, check reachability to oceans” is the naive O((MN)²) trap, (b) can you invent (or recognize) the inversion, (c) do you implement multi-source BFS (queue seeded with all border cells, not one), (d) do you state the visited-on-enqueue invariant.


🔗 https://leetcode.com/problems/pacific-atlantic-water-flow/

STOP. 25-min timer. Try naive first — see the complexity wall — then pivot.


3. Prerequisite Concepts


4. How to Approach (FRAMEWORK Steps 1–9)

Step 1 — Restate: “Given M×N heights, return all [r, c] from which water can reach BOTH the Pacific (top/left) AND the Atlantic (bottom/right) by flowing only to ≤-height neighbors.”

Step 2 — Clarify:

  • “Equal-height neighbors — does water flow?” (Yes — “equal or lower.”)
  • “What if grid is empty or 1-row / 1-col?” (1×1: the cell touches both → answer is itself. 1-row: every cell touches top edge = Pacific AND bottom edge = Atlantic → all cells.)
  • “Output order?” (LC accepts any.)

Step 3 — Constraints: M, N ≤ 200. So 40k cells. O((MN)²) = 1.6e9 → likely TLE. O(MN) is required.

Step 4 — Examples: LC’s classic 5×5 grid [[1,2,2,3,5],[3,2,3,4,4],[2,4,5,3,1],[6,7,1,4,5],[5,1,1,2,4]][[0,4],[1,3],[1,4],[2,2],[3,0],[3,1],[4,0]].

Step 5 — Brute Force: For each cell (r,c), run BFS forward (move to ≤ neighbors) and check whether we reach any Pacific cell AND any Atlantic cell. O((MN)²). State it; show it’s too slow; pivot to the inverted multi-source BFS.

Step 6 — Complexity: O(MN) time (each cell visited at most twice across both BFS runs). O(MN) auxiliary for the two visited sets.

Step 7 — Pattern Recognition: “Reachability from many sources” → multi-source BFS. “Inversion of the question because forward-from-every-cell is too slow” → inversion pattern. Companions: LC 994 (Rotting Oranges — multi-source BFS), LC 542 (01 Matrix — multi-source BFS from all zeros), LC 1162 (As Far From Land), LC 286 (Walls and Gates).

Step 8 — Develop:

def pacific_atlantic(heights):
    if not heights or not heights[0]: return []
    m, n = len(heights), len(heights[0])
    pac = [[False]*n for _ in range(m)]
    atl = [[False]*n for _ in range(m)]

    def bfs(starts, reachable):
        q = deque(starts)
        for (r, c) in starts: reachable[r][c] = True
        while q:
            r, c = q.popleft()
            for dr, dc in DIRS:
                nr, nc = r+dr, c+dc
                if (0 <= nr < m and 0 <= nc < n
                    and not reachable[nr][nc]
                    and heights[nr][nc] >= heights[r][c]):  # ≥ because we're walking REVERSED
                    reachable[nr][nc] = True
                    q.append((nr, nc))

    pac_starts = [(0, c) for c in range(n)] + [(r, 0) for r in range(m)]
    atl_starts = [(m-1, c) for c in range(n)] + [(r, n-1) for r in range(m)]
    bfs(pac_starts, pac)
    bfs(atl_starts, atl)

    return [[r, c] for r in range(m) for c in range(n) if pac[r][c] and atl[r][c]]

Step 9 — Test: Empty, 1×1, 1-row, all-equal-height, monotone increasing, the LC canonical example.


5. Progressive Hints

If stuck for 5 min, open hints.md.


6. Deeper Insight — The Inversion Trick

The reason this problem is interview-defining is that the obvious algorithm (BFS forward from every cell, check reachability) is too slow but feels right. Senior+ candidates pivot to the inversion:

“Instead of asking ‘from which cells can I reach the Pacific?’, ask ‘from the Pacific edges, which cells can be reached by going UPHILL?’ The reverse of a flow relation is reachable by reversing the direction comparison.”

Why this works: if (r, c) can flow to a Pacific cell via path (r,c) → x₁ → x₂ → ... → P, then reversing every edge: P → ... → x₂ → x₁ → (r,c). The reverse-edge condition is “neighbor is higher or equal” instead of “neighbor is lower or equal.”

The multi-source aspect: the Pacific is not one cell; it’s a whole set of border cells. So we BFS from ALL of them in parallel. Seed the queue with every Pacific border cell up front (marking each visited at enqueue), then run a single BFS. The classic mistake is to BFS once per source — that’s K separate BFSes, total O(K·MN) instead of O(MN).

This same inversion pattern shows up everywhere:

  • LC 994 (Rotting Oranges): seed BFS with all rotten oranges, expand.
  • LC 542 (01 Matrix): seed with all 0s, BFS outward to compute distance.
  • LC 1162 (As Far From Land): seed with all land, BFS outward.
  • LC 286 (Walls and Gates): seed with all gates, BFS outward.

The general lesson: when “for every starting cell, compute X” is too slow, try inverting and starting from the destination instead.


7. Anti-Pattern Analysis

  1. Sticking with naive O((MN)²). Times out on LC. Worse: the interview signal of pivoting to the inversion is the entire point — if you don’t pivot, you fail the question regardless of whether your naive code “works.”
  2. Forgetting to invert the comparison. When BFSing from the ocean, the comparison flips: neighbors must be current height (uphill), not ≤. Half of bug submissions have this wrong.
  3. One BFS per border cell instead of seeded multi-source BFS. Same complexity (O(MN) per BFS × K border cells = O(MN²) or worse). The trick is ONE BFS with K initial seeds.
  4. Marking visited on dequeue. Queue grows quadratic when many neighbors all enqueue the same cell. Mark on enqueue.
  5. Hand-rolling the border-cell construction with subtle off-by-one. Easy to double-count corners. [(0, c) for c in range(n)] + [(r, 0) for r in range(m)] double-counts (0, 0) but the visited-set check renders that harmless. State this explicitly.
  6. Mutating heights as the visited marker. Tempting but destructive AND impossible here because you need the actual heights for comparison. Use a separate bool matrix.

8. Skills & Takeaways

  • The inversion pattern — “for every source, compute reachability” → “from the destination(s), compute reverse reachability.”
  • Multi-source BFS — seed with K cells upfront; single BFS; O(V+E).
  • Comparison inversion — flow rule “≤” becomes reverse-traversal “≥”. Easy to get wrong; comment it.
  • Two-set intersection on grids — when you need “satisfies both A and B,” compute reachable sets for A and B independently, then intersect.
  • Border-cell construction — corner double-counts are harmless under visited dedup; mention this aloud.

9. When to Move On

Move on when:

  • You can write multi-source BFS with both Pacific and Atlantic BFS in under 8 minutes.
  • You can explain WHY the comparison flips when you reverse the BFS direction.
  • You can articulate three other problems that use multi-source BFS (LC 994, 542, 1162).
  • Your stress test runs both the brute (per-cell forward BFS) and the inverted multi-source BFS on 100 random grids and confirms identical output.
  • You can sketch the DFS version of the multi-source inversion (functionally identical; just a recursive flood from each border cell).

10. Company Context

Amazon:

  • High frequency. Often the “Round 3” question after warmup graph problems.
  • Bar Raiser will ask the brute solution first (“just to make sure you can write the trivial one”), then drive you to the inversion: “this is O((MN)²); can we do better?”
  • Follow-up: “now there are three oceans; cells must reach all three.” Trivial extension — three BFS sets, three-way intersection.
  • Scorecard phrase: “Algorithm + Earn Trust — recognized the brute force was too slow, pivoted to inversion cleanly.”

Google:

  • Skips the brute. Goes straight to “design the algorithm assuming the brute is too slow.” Tests the pivot-skill directly.
  • Will ask correctness proof: “prove the inversion gives the same set as the forward BFS.” Phrase the proof as “edge reversal preserves reachability between the same node pair.”
  • Scorecard phrase: “Algorithm — produced O(MN) via inversion + multi-source BFS, with proof.”

Meta:

  • Asks the variant where heights are continuous (floats) instead of ints. Algorithm unchanged — but tests whether you blindly assume integer comparison.
  • Scorecard phrase: “Solid — recognized algorithm carries over.”

Microsoft:

  • Often combines: “do Pacific Atlantic, then return the cells reachable to only one ocean.” Trivial post-processing on the two sets.
  • Scorecard phrase: “Strong — built two-set machinery, extended cleanly.”

Bloomberg:

  • Asks variant in a domain frame (“data center cooling — heat flows from high-temp cells to lower-temp neighbors; which cells eventually radiate heat to both edges?”). Same algorithm.
  • Scorecard phrase: “Pragmatic — translated domain problem.”

Apple:

  • Has been known to ask the 3D variant (water flows in cubic grid). Algorithm extends to 6-direction DIRS and 6-face start-sets. Discuss the extension; LC version stays 2D.
  • Scorecard phrase: “Strong — handled algorithm + sketched 3D extension.”

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff/Principal
First instinctBFS per cellBFS per cell, sees O((MN)²)Inversion in 2 minInversion + names multi-source BFS pattern
Comparison flipWrongRight after one debugRight + comments whyRight + writes formal proof
Multi-sourceOne BFS per border cellSeeded queueSeeded queue + visited-on-enqueueMentions LC 994/542/1162 as same pattern
VariantsStuck“Other shapes work too”Solves 3-ocean variantSolves 3D extension and per-ocean filtering

12. Level Delta

Mid (L4):

  • Multi-source BFS, both Pacific and Atlantic.
  • Correct comparison inversion (≥ instead of ≤).
  • Correct intersection logic.

Senior (L5):

  • Notes “naive is O((MN)²)” before coding, pivots to inversion.
  • Both BFS and DFS variants.
  • Names “multi-source BFS” as the pattern; cites at least one analog (LC 994 or 542).
  • Comments the comparison flip in the code.

Staff (L6):

  • Discusses proof of equivalence (edge reversal preserves directed reachability between endpoints).
  • Solves 3-ocean extension and “reachable to only one” extension on demand.
  • Discusses memory: two N²-size bool matrices vs bitmap encoding for very large grids.
  • Discusses parallelism: the two BFS runs are independent; can fork to two threads. Final intersection is O(MN).

Principal (L7+):

  • Reframes as computing the “drainage basin” of two boundary regions. Reframe makes the algorithm obvious.
  • Discusses how GIS systems (ArcGIS, QGIS) implement watershed delineation using this exact algorithm at planet scale.
  • Discusses how this generalizes to weighted edges (e.g., flow rate based on slope) → Dijkstra-style multi-source shortest-path on a min-cost graph.
  • Discusses streaming variant: heights arrive cell-by-cell; how to maintain the answer incrementally (harder; touches on dynamic graph reachability).

13. Follow-up Q&A

Q1: How would you adapt this if there were three oceans (Pacific top edge, Atlantic right edge, Arctic bottom-left corner)? A: Run three independent multi-source BFSes, one per ocean. Result = cells in all three reachable sets. O(MN) time, O(MN) per set.

Q2: How would you find cells that drain to EXACTLY one of the two oceans? A: After computing pac and atl sets, return cells in pac XOR atl. One pass.

Q3: Prove the inversion is correct. A: Define forward reachability: (r,c) → P iff there’s a path through ≤-height neighbors ending at any Pacific cell. Define reverse reachability: P → (r,c) iff there’s a path through ≥-height neighbors starting from any Pacific cell. Path (r,c) = v₀ → v₁ → ... → vₖ = P exists forward iff heights[vᵢ₊₁] ≤ heights[vᵢ] for all i. Reverse the path: vₖ → vₖ₋₁ → ... → v₀. Now heights[vᵢ] ≥ heights[vᵢ₊₁] is exactly the reverse condition. So forward reachability (r,c) → P iff reverse reachability P → (r,c). QED.

Q4: What if the grid is too large to fit two visited matrices in memory? A: Use bit-packed visited (1 bit per cell instead of 1 byte → 8× compression). Or stream the grid in horizontal bands, but then BFS doesn’t fit neatly into the band; you’d need a more sophisticated “ghost cell” multi-pass algorithm. For 200×200 LC inputs, just use bool matrices.

Q5: Heights become floats. Does anything change? A: No. Comparison still works on floats. Only watch out for NaN — if heights can be NaN, define ordering carefully (probably treat NaN cells as impassable).


14. Solution Walkthrough

See solution.py:

  • pacific_atlantic_bfs(heights) — multi-source BFS, canonical.
  • pacific_atlantic_dfs(heights) — multi-source DFS (recursive, same algorithm).
  • pacific_atlantic_brute(heights) — O((MN)²) per-cell forward BFS, used as oracle.
  • stress_test() — 100 random grids up to 15×15, all three implementations return the same set (compared as sorted lists of tuples).
  • edge_cases() — empty, 1×1, 1-row, 1-col, all-equal, monotone, LC canonical 5×5.

Run: python3 solution.py → “ALL TESTS PASSED”.


15. Beyond the Problem

Principal-engineer code review comment:

“The inversion is correct and elegant. But please name it. The pattern is ‘multi-source BFS with edge reversal’ — that name should appear in the docstring. The next engineer who sees this in three years will recognize the family instantly: rotting oranges, 01 matrix, walls and gates, and yes, every watershed-delineation tool in GIS land. They are ALL the same algorithm; the only thing that varies is the seed set and the edge predicate. If you ever generalize this for our terrain pipeline, please write multi_source_bfs(seeds, predicate) once and reuse — don’t copy-paste this loop into four files.”

Connections forward:

  • p25 (Word Ladder) — BFS on an implicit graph; sister pattern.
  • LC 994 (Rotting Oranges), LC 542 (01 Matrix), LC 1162 (As Far From Land), LC 286 (Walls and Gates), LC 934 (Shortest Bridge) — all multi-source BFS.
  • Production: GIS watershed delineation, image segmentation (region growing from seeds), epidemic modeling (multi-source SIR), network reachability from a set of failure points.
  • Theoretical: the “reverse reachability” trick is the foundation of backward-chaining in AI search (start from the goal, work backwards to the start).