Week 5 — Graphs Intro: BFS, DFS, and the Topological Mindset

Theme: A graph is just nodes + edges, but the mental shift from trees to graphs is the visited set. Without it, every graph algorithm loops forever. Every problem this week is, at its core, “the right traversal with the right state.” Time budget: ~8 hours focused work; ~14 hours including the LC bank. Hard rule: every graph problem solved twice — once with BFS, once with DFS. State which version you’d ship and why (memory shape, recursion depth, level-distance need).


Why this week

Weeks 3–4 gave you tree recursion fluency. Trees are connected, acyclic, and rooted — three luxuries that disappear in graphs. This week strips each one away:

  1. Implicit graphs are the norm. Most interview “graph” problems are NOT given an adjacency list. The grid in Number of Islands, the prerequisite list in Course Schedule, the word-ladder dictionary — you’re handed input that encodes a graph. Recognizing the implicit graph is half the problem. The faster you reframe “this is a graph where nodes are X and edges are Y,” the faster you solve it.
  2. The visited set is non-negotiable. Trees can’t cycle. Graphs can. Every BFS/DFS this week needs explicit cycle-prevention. The mistake of “forgetting visited” causes 30% of failed graph attempts in real interviews. Burn the discipline in this week.
  3. BFS distance = unweighted shortest path. Once you internalize that the level number in BFS is the distance from source, half of “shortest path” problems become BFS + level snapshot — the exact pattern from p20. Word Ladder, Rotting Oranges, Open the Lock — they’re all the same skeleton.
  4. Topological sort = “cycle detection + ordering” combo. Course Schedule is the gateway. Kahn’s BFS variant and DFS-coloring variant are both ~20 lines; both are reusable templates that recur in build systems, scheduler kernels, and ML pipelines.
  5. Multi-source BFS is a 1-line variant. Pacific Atlantic teaches the inversion trick — instead of asking “can water from each cell reach the ocean?” (NM separate BFSes), ask “from the ocean, which cells can be reached?” (one BFS per ocean from all border cells simultaneously). This kind of reframing is what distinguishes Staff from Senior.

By Friday you should: write BFS and DFS on a grid from memory in under 90 seconds each, implement both Kahn’s and DFS-coloring topo sort, articulate why bidirectional BFS halves Word Ladder runtime, and recognize the “multi-source BFS from boundary” pattern.


Flagship problems (the 5 you OWN this week)

#ProblemLCDifficultyWhy this one
p21Number of Islands200MediumThe canonical grid-as-graph problem. Connected components via DFS or BFS. Every FAANG asks this.
p22Clone Graph133MediumForces you to handle cycles explicitly via memo map. Both DFS and BFS variants are instructive.
p23Course Schedule207MediumCycle detection + topological sort. Kahn’s BFS vs DFS-3-color. Gateway to build systems.
p24Pacific Atlantic Water Flow417MediumMulti-source BFS from boundary + the inversion trick. Teaches “reframe to avoid N independent searches.”
p25Word Ladder127HardBFS on an implicit graph; pattern-bucket adjacency trick; bidirectional BFS for the follow-up.

LC bank (~25 problems for spaced repetition)

Grid traversal (connected components, flood-fill, area):

  • LC 200 (p21), LC 695 Max Area of Island, LC 463 Island Perimeter, LC 733 Flood Fill, LC 1254 Closed Islands, LC 130 Surrounded Regions, LC 1020 Number of Enclaves

Graph cycle / topo sort:

  • LC 207 (p23), LC 210 Course Schedule II, LC 269 Alien Dictionary (Hard), LC 802 Find Eventual Safe States, LC 310 Minimum Height Trees

Multi-source BFS / boundary BFS:

  • LC 417 (p24), LC 994 Rotting Oranges, LC 1162 As Far from Land as Possible, LC 542 01 Matrix

Implicit-graph BFS shortest path:

  • LC 127 (p25), LC 126 Word Ladder II (all shortest paths — Hard), LC 752 Open the Lock, LC 433 Min Genetic Mutation, LC 854 K-Similar Strings (Hard)

Graph cloning / serialization:

  • LC 133 (p22), LC 138 Copy List with Random Pointer (same memo-map pattern), LC 1485 Clone Binary Tree with Random Pointer

Readiness gate (you should be able to answer ALL these out loud before moving on)

  1. What’s the visited set, and why can’t graphs skip it? Because graphs can have cycles; without marking visited, BFS/DFS enters an infinite loop. Trees can’t cycle, so they never needed it.
  2. When do you choose BFS over DFS, and vice versa? BFS when you need shortest distance in an unweighted graph, or when the answer is “level-by-level” (Pacific Atlantic uses level-by-level reachability). DFS when you need cycle detection (use 3-color), connected components on a non-distance question (Number of Islands), or when the recursion depth is bounded (deeper-than-wide problems).
  3. What’s the canonical grid-BFS template? q = deque([start]), visited = {start}, while q: pop, for each of 4 neighbors that’s in-bounds and unvisited, mark visited and enqueue. Memorize cold.
  4. In topological sort, what’s the difference between Kahn’s BFS variant and DFS-coloring variant? Kahn’s: count in-degrees, start with all in-degree-0 nodes, decrement in-degrees as you “remove” nodes; if final order has fewer than N nodes → cycle exists. DFS-coloring: WHITE/GRAY/BLACK; entering GRAY → graph already explored → cycle.
  5. Why does multi-source BFS (Pacific Atlantic, Rotting Oranges) work? Because BFS expands by distance from the frontier; initializing the frontier with multiple sources is equivalent to running |sources| BFSes in parallel and merging the wavefronts. Same O(V+E), no double work.
  6. What’s the pattern-bucket trick in Word Ladder? Adjacency by edit-distance-1 is implicit. Instead of comparing every pair (O(N²·L)), bucket each word by “wildcarded” versions (h*t, ho*, *ot) so two words sharing a bucket are 1-edit-distance neighbors. O(N·L²) preprocessing, fast neighbor lookup.
  7. What does bidirectional BFS buy you? It searches simultaneously from start and end, meeting in the middle. Since BFS branching factor is b and shortest path length is d, naive BFS visits O(b^d) nodes; bidirectional visits 2 * O(b^(d/2)) = O(b^(d/2)). Square-root speedup. Worth knowing as a follow-up for Word Ladder.
  8. What does “implicit graph” mean, and what’s the cost? The graph isn’t materialized as an adjacency list; you compute neighbors on the fly from input. Cost: neighbor enumeration may be expensive (Word Ladder’s L²-per-neighbor without bucketing). Benefit: no O(V+E) memory to store the graph.

If you can’t articulate all 8, the week is not done.


Recurring traps

  1. Forgetting visited — infinite loop on the first cycle. Mark visited when enqueueing, not when dequeuing. (Dequeue-marking lets duplicates accumulate in the queue.)
  2. Mutating the input grid as a visited marker — fine for interviews IF you state it (“I’m flipping '1' to '0' to mark visited; the input is destroyed”) OR if you restore it. Otherwise the interviewer flags “destructive operation on shared state.”
  3. Recursion depth exhaustion on giant grids — DFS on a 1000×1000 all-'1' grid hits Python’s recursion limit. Use sys.setrecursionlimit(...) or switch to iterative BFS/DFS-with-explicit-stack.
  4. Forgetting to mark the start as visited before entering BFS — causes re-enqueueing of the source. Always: q = deque([start]); visited = {start} together.
  5. Off-by-one with neighbor directions — write DIRS = ((0,1),(0,-1),(1,0),(-1,0)) once at module top; never inline.
  6. Topo sort returns partial result on cyclic input — Kahn’s silently emits a short list when there’s a cycle. Always check len(result) == N and raise/return appropriately.

Stretch goals (only if everything above is solid)

  • LC 126 — Word Ladder II. Return all shortest transformation sequences. Builds parent-map during BFS, then DFS-backtracks. Hard.
  • LC 269 — Alien Dictionary. Topo sort on letters derived from word ordering. Easy to fail edge cases (prefix issue, single letters).
  • LC 815 — Bus Routes. BFS where each “level” = one bus ride. Implicit graph where buses are nodes, stops are hyperedges.

Pointer to the first problem

Start with p21 — Number of Islands. It’s the gentlest grid-as-graph entry; everything else this week builds on it.