p62 — Course Schedule II
Source: LeetCode 210 · Medium · Topics: Graph, Topological Sort, BFS, DFS Companies (2024–2025): Amazon (very high), Meta (high), Google (high), Microsoft (high), LinkedIn (medium-high) Loop position: Topological sort canonical. Tests Kahn’s BFS (indegree counter) AND DFS post-order — both expected. Build systems, task scheduling, dependency resolution all reduce here.
1. Quick Context
numCourses courses 0..n-1, prerequisites[i] = [a, b] meaning take b before a. Return a valid order, or [] if cycle exists (impossible).
Two equivalent algorithms:
Kahn’s BFS (indegree)
Compute indegree of every node. Push indegree-0 nodes onto a queue. Pop, append to result, decrement indegrees of neighbors; push any that drop to 0. If final result has all n nodes → return it; else cycle.
DFS post-order
DFS with three-color states (WHITE / GRAY / BLACK). On entering a node mark GRAY; recurse on neighbors; if we hit a GRAY neighbor → cycle. On finish mark BLACK and append to stack. Reverse the stack at the end.
Both O(V + E).
2. LeetCode Link + Attempt Gate
🔗 https://leetcode.com/problems/course-schedule-ii/
STOP. 25-min timer. Aim to know BOTH algorithms.
3. Prerequisite Concepts
- Adjacency list construction.
- BFS with
collections.deque. - DFS with explicit recursion + three colors (white/gray/black) for cycle detection.
4. How to Approach (FRAMEWORK 1–9)
1. Restate: Linear order respecting partial-order dependencies.
2. Clarify: Edges duplicated? Self-loop? [a, a] is a cycle. Disconnected components allowed?
3. Examples: n=2, prereqs=[[1,0]] → [0,1]; n=4, [[1,0],[2,0],[3,1],[3,2]] → e.g. [0,1,2,3] or [0,2,1,3]; with cycle [[0,1],[1,0]] → [].
5. Brute force: Try all n! permutations and check validity. n! infeasible.
6. Target: O(V + E).
7. Pattern: Topological sort.
8. Develop: see solution.py.
9. Test: empty prereqs; single course; chain; tree; cycle of length 2 and 3; disconnected components; self-loop.
5. Progressive Hints
See hints.md.
6. Deeper Insight
6.1 Edge direction convention — read it carefully
prerequisites[i] = [a, b] says “to take a, you must take b first.” This is a dependency from b → a. Build the adjacency list as adj[b].append(a) and the indegree of a increments.
Getting the direction wrong gives a topologically-sorted but BACKWARDS answer. The most common LC bug here.
6.2 Kahn’s BFS — full algorithm
1. Build adj list + indegree[]
2. queue = deque of all v with indegree[v] == 0
3. result = []
4. while queue:
v = queue.popleft()
result.append(v)
for u in adj[v]:
indegree[u] -= 1
if indegree[u] == 0:
queue.append(u)
5. return result if len(result) == n else []
Why correct: Every node with no remaining predecessors can safely come next. Cycle ⇔ some node never reaches indegree 0 ⇔ len(result) < n.
6.3 DFS post-order — full algorithm
Three colors: WHITE (unvisited), GRAY (in current DFS stack), BLACK (done).
def dfs(v):
if color[v] == GRAY: cycle = True; return
if color[v] == BLACK: return
color[v] = GRAY
for u in adj[v]:
dfs(u)
if cycle: return
color[v] = BLACK
order.append(v) # post-order push
for v in 0..n-1: dfs(v)
return [] if cycle else order[::-1]
Why correct: DFS finishes children before parents. Reversing post-order gives a sequence where every edge u→v has u before v. Cycle ⇔ encountering a GRAY node (back-edge to ancestor).
6.4 Why both algorithms are worth knowing
- Kahn’s BFS generalizes to “lexicographically smallest topo order” (use a
heapqpriority queue instead ofdeque). - DFS generalizes to strongly-connected components (Tarjan, Kosaraju), Euler tours, articulation points.
- Interviewers often follow up “what about cycle DETECTION only?” or “what if we want lex-smallest?” — having both algorithms in pocket answers either fork.
6.5 Subtleties
- Self-loop
[a, a]is a cycle of length 1 — both algorithms must detect. - Duplicate edges in
prerequisitesinflate indegrees if not deduped — Kahn’s still works (it just decrements multiple times), but the input usually doesn’t have dupes; mention if asked. - Disconnected components: both algorithms handle naturally (Kahn enqueues all 0-indegree starts; DFS loops over all nodes).
- Recursion depth: DFS on n=2000 may hit Python’s default recursion limit; use iterative DFS or
sys.setrecursionlimit(10**6)for safety.
6.6 Lex-smallest topological order
Replace deque with heapq (min-heap):
heap = [v for v in range(n) if indegree[v] == 0]
heapq.heapify(heap)
while heap:
v = heapq.heappop(heap)
result.append(v)
for u in adj[v]:
indegree[u] -= 1
if indegree[u] == 0:
heapq.heappush(heap, u)
O((V + E) log V). Standard follow-up.
6.7 Cycle reporting (not just detection)
Sometimes interviewers ask “return the cycle.” Modify DFS: when you hit a GRAY node, walk parent pointers backwards from current node to the GRAY node to recover the cycle. Useful in dependency-resolution diagnostics.
6.8 Production use cases
- Build systems (Make, Bazel, Ninja): topological order of compilation targets.
- Task schedulers (Airflow, Luigi, Dagster): DAG of operators.
- Spreadsheet engines (Excel, Google Sheets): cell evaluation order.
- Package managers (npm, pip, apt): install order respecting deps.
- DB query optimizers: rule application order.
Every one of these crashes / loops on a cycle, so detection is essential.
7. Anti-Pattern Analysis
- Wrong edge direction —
adj[a].append(b)instead ofadj[b].append(a). Produces a valid reverse-topo, fails the LC test. - Indegree decremented to negative — bug from duplicate edges or wrong direction.
- DFS without GRAY state — only WHITE/BLACK; cannot detect cycles, mistakes back-edges for cross-edges.
- Recursive DFS without raised recursion limit — silent crash for n > 1000.
- Forget to check
len(result) == nfor Kahn — partial result returned on a cycle, LC fails. - Use a set instead of indegree counter — works but adds log/hash overhead, makes follow-up “lex-smallest” awkward.
8. Skills & Takeaways
- Kahn’s BFS (indegree) and DFS post-order — both in pocket.
- Three-color DFS for cycle detection.
- Adjacency list + indegree counter pattern.
- Iterative DFS via explicit stack for large graphs.
- Lex-smallest variant via
heapq.
9. When to Move On
- Solve LC 207 (Course Schedule, cycle detection only) in 5 min.
- Solve LC 210 with BOTH algorithms in 15 min total.
- Solve LC 269 (Alien Dictionary) — topo sort over inferred char ordering.
- Solve LC 1857 (Largest Color Value in DAG) — topo + DP.
- Explain three-color DFS in 60 seconds.
10. Company Context
- Amazon: L5/L6; productized as “service deployment order across regions.”
- Meta: E5; LC 269 follow-up common.
- Google: L5; lex-smallest variant a frequent twist.
- Microsoft: Compiler / build-system teams.
- LinkedIn: L5; skill-tree / course-recommendation analogue.
Scorecard for strong: “Implemented both Kahn and DFS, articulated three-color cycle detection, mentioned lex-smallest extension via heap, related to build systems.”
11. Interviewer’s Lens
| Signal | Junior | Mid | Senior | Staff+ |
|---|---|---|---|---|
| Edge direction | Confused | Correct after prompt | Correct + articulates | + draws DAG to verify |
| Algorithm choice | One only | Both names | Both implementable | + chooses based on follow-up potential |
| Cycle detection | Forgets | Indegree check | + DFS GRAY state | + reports cycle, not just detects |
| Extensions | None | LC 207 | LC 269 alien dict | + Tarjan SCC + Kahn-with-heap lex-smallest |
12. Level Delta
- Mid (L4): Kahn’s BFS, correct edge direction, returns
[]on cycle. - Senior (L5): + DFS post-order + three-color cycle detection + lex-smallest with
heapq+ LC 269 alien dictionary. - Staff (L6): + iterative DFS to avoid recursion limit + Tarjan SCC awareness + reports actual cycle + production parallels (build systems).
- Principal (L7+): + parallel topological sort (Kahn’s BFS is naturally levelizable across cores) + incremental topo-sort (Pearce-Kelly 2007 algorithm for online edge insertions) + transitive closure / DAG reachability via DFS+memoization.
13. Follow-up Q&A
Q1: Return only “can it be done” (cycle detection).
A1: LC 207. Same algorithm; return len(result) == n (Kahn) or not cycle (DFS).
Q2: Lex-smallest topological order.
A2: Kahn’s with heapq instead of deque. O((V+E) log V).
Q3: All valid topological orders (not just one). A3: Backtrack on the set of “ready” (indegree-0) nodes; for each choice, decrement and recurse. Exponential in pathological cases (a DAG with no constraints has n! orders).
Q4: Detect AND report the cycle.
A4: In DFS, maintain parent pointers. On encountering GRAY neighbor u from v, walk parent chain from v back to u to recover the cycle.
Q5: What if prerequisites has 10^9 edges?
A5: Adjacency list won’t fit. Stream edges; if you can bound indegree-0 sets externally or process by batches; otherwise the problem isn’t tractable in-memory. Production: graph DBs (Neo4j) for huge dependency graphs.
14. Solution Walkthrough
See solution.py:
find_order_kahn— BFS with indegree counter anddeque.find_order_dfs— three-color DFS, post-order push, reverse at end.find_order_brute— try all permutations (stress n ≤ 6).
15. Beyond the Problem
Principal code review: “Topological sort is one of the dozen algorithms you’ll touch every week of an engineering career. Every build system, every CI pipeline, every workflow engine, every database query plan — they all reduce to this. The interview asks for an order; production asks for parallel scheduling (which subset can run concurrently?), incremental rebuilds (only redo what changed downstream?), cycle diagnostics (when a circular dep is added in code review, surface WHICH cycle?). Kahn’s algorithm naturally answers “what can run now” — it levelizes. That’s why Bazel’s job scheduler is essentially Kahn-with-a-thread-pool. Learn the algorithm; then learn its three or four most common extensions; you’ll use them forever.“