p70 — Reconstruct Itinerary

Source: LeetCode 332 · Hard · Topics: Graph, Eulerian Path, Hierholzer, DFS Companies (2024–2025): Google (high), Amazon (medium-high), Meta (medium), Uber (medium) Loop position: The Eulerian path problem in disguise. Tests Hierholzer’s algorithm — one of the named graph algorithms that distinguishes Senior from Mid. Plus the lex tie-break subtlety that catches everyone.

1. Quick Context

Given a list of flight tickets (departure, arrival), construct the trip that uses every ticket exactly once, starting from “JFK”. When multiple valid itineraries exist, return the lexicographically smallest.

Recognize: This is finding an Eulerian path (a walk that uses every edge exactly once) in a directed multigraph, with a lexicographic tie-break.

Algorithm: Hierholzer’s, with destinations processed in sorted order (min-heap or pre-sorted list). The clever twist: append nodes to the route in post-order during DFS, then reverse at the end.

O((V+E) log E) due to the heap; could be O(V+E) with bucket sort.

2. LeetCode Link + Attempt Gate

🔗 https://leetcode.com/problems/reconstruct-itinerary/

STOP. 35-min timer. The “reverse post-order” insight is non-obvious.

3. Prerequisite Concepts

DFS over a multigraph (multiple edges between same pair allowed).
Eulerian path conditions (we don’t need to verify — problem guarantees one exists).
Min-heap for lexicographic ordering of destinations.
Why post-order + reverse = correct Eulerian path output.

4. How to Approach (FRAMEWORK 1–9)

1. Restate: Find an Eulerian path starting at JFK, lex-smallest among valid ones.

2. Clarify: Always solvable? (LC promises yes.) Multiple tickets between same pair? (Yes — directed multigraph.) Output one itinerary.

3. Examples:

[["MUC","LHR"],["JFK","MUC"],["SFO","SJC"],["LHR","SFO"]] → ["JFK","MUC","LHR","SFO","SJC"].
[["JFK","SFO"],["JFK","ATL"],["SFO","ATL"],["ATL","JFK"],["ATL","SFO"]] → ["JFK","ATL","JFK","SFO","ATL","SFO"] (lex-smallest among multiple valid).

5. Brute force: Try all permutations of tickets; for each, check it forms a valid path starting JFK. O(N!). Only for small inputs.

6. Target: O((V+E) log E) via Hierholzer.

7. Pattern: Hierholzer’s Eulerian path with post-order DFS + lex tie-break via heap.

8. Develop: see solution.py.

9. Test: single ticket; loop (returns to JFK); branching where greedy choice leads to dead-end (Hierholzer recovers via post-order).

5. Progressive Hints

See hints.md.

6. Deeper Insight

6.1 Why greedy lex-smallest DFS DOESN’T work directly

Naive: at each step go to the lex-smallest unvisited destination. Counterexample: at JFK with destinations {ATL, KUL} where ATL leads to a dead end but KUL has more edges. Greedy goes ATL first → can’t finish → wrong.

The correct algorithm:

Recursive DFS, but APPEND TO ROUTE AT THE END (post-order), not the beginning (pre-order).
This guarantees that any “dead-end” subgraph gets added LAST in the right order; the longer paths get explored fully first.

6.2 Why post-order + reverse works (the proof intuition)

When DFS hits a dead end (no more outgoing edges from current node), that node MUST be the LAST node in the itinerary (or last in its subtour). So append at the end of route. As DFS unwinds, nodes are added in reverse order. Reverse the entire route at the end.

Formally: this is Hierholzer’s algorithm. It works on any graph with an Eulerian path; the lex requirement adds the heap.

6.3 Heap of destinations for lex tie-break

For each node, store outgoing destinations in a min-heap. Each DFS step pops the smallest destination, removes that edge (decrements edge count), recurses.

Without the heap, you’d sort destinations once, but as edges are consumed you’d need to re-find the next smallest — heap is cleaner.

6.4 Iterative Hierholzer

The recursive version is elegant but the iterative version is the standard production form:

stack = [start]
route = []
while stack:
    u = stack[-1]
    if heap[u]:
        v = heappop(heap[u])
        stack.append(v)
    else:
        route.append(stack.pop())
return reversed(route)

Same post-order + reverse, but using explicit stack. Beautiful in its compactness.

6.5 Eulerian path conditions (when you’d verify)

A directed graph has an Eulerian PATH iff:

At most one vertex has out_degree - in_degree = 1 (the start).
At most one vertex has in_degree - out_degree = 1 (the end).
All other vertices have equal in/out degree.
The graph is connected (considering only vertices with edges).

LC 332 promises this; in interview you may be asked to verify.

6.6 Eulerian CYCLE vs PATH

Cycle: starts and ends at same vertex; all vertices have in_deg = out_deg. Path: starts and ends at different vertices; exactly two vertices have unequal in/out.

Hierholzer’s works for both; the LC problem could be either depending on tickets.

6.7 Chinese Postman Problem (the general case)

If you’re allowed to traverse edges more than once (and want min total), it’s the Chinese Postman Problem — solvable in poly time for undirected, harder for directed (called “windy postman”). Out of scope for LC but worth knowing the name.

6.8 Lex-smallest via destination ordering — sufficient or not?

Claim: with destinations processed in lex order at each step, Hierholzer’s post-order + reverse gives the lex-smallest Eulerian path. The proof uses an exchange argument — out of scope but worth knowing the result holds.

7. Anti-Pattern Analysis

Greedy lex-smallest pre-order DFS — fails on the standard counterexample; gives wrong itinerary.
Use the heap but forget to RE-INSERT when backtracking (rare bug; correct algorithm doesn’t backtrack at the edge level since post-order handles it).
Sort destinations once, walk in order — heap is the clean tool because edges are popped non-trivially.
Build adjacency as a Counter without sort — lose lex order.
Recursive DFS with deep itineraries — Python recursion limit. LC 332 inputs are small; iterative is still safer.
Append nodes in pre-order, reverse — wrong; pre-order doesn’t have the right property.

8. Skills & Takeaways

Eulerian path recognition: “use every edge exactly once” → Hierholzer.
Post-order DFS + reverse — a counterintuitive but correct pattern.
Heap for lexicographic processing under edge removal.
Iterative Hierholzer with explicit stack.

9. When to Move On

Solve LC 332 cold in 30 min using iterative Hierholzer.
Explain why pre-order greedy fails (the dead-end-first counterexample) in 2 minutes.
Recognize Eulerian path/cycle in disguise (e.g., domino chains, DNA assembly fragments).

10. Company Context

Google: L5/L6; classic “named algorithm” probe.
Amazon: L6; logistics/routing flavor.
Uber: L5; routing/dispatch context.
Meta: E5; less common, harder DFS variant.

Scorecard for strong: “Recognized Eulerian path; named Hierholzer; explained why post-order; wrote iterative version; cited the dead-end counterexample for naive greedy.”

11. Interviewer’s Lens

Signal	Junior	Mid	Senior	Staff+
Recognition	DFS attempt	“Eulerian-ish”	Names Hierholzer	+ lex proof + LC variants
Greedy bug	Doesn’t see	After failing test	Anticipates	+ cites counterexample upfront
Iterative form	None	Recursive only	Iterative	+ Eulerian path conditions verified
Generalization	None	None	Notes Euler cycle vs path	+ Chinese Postman + DNA assembly + de Bruijn graphs

12. Level Delta

Mid (L4): Recursive Hierholzer; works on small inputs; misses iterative.
Senior (L5): Iterative Hierholzer + lex-smallest via heap + cites greedy counterexample + post-order intuition.
Staff (L6): + Eulerian path/cycle conditions + verifies feasibility + de Bruijn graph application (DNA assembly).
Principal (L7+): + Chinese Postman generalization + production routing (vehicle routing problem framing) + the Bose-Hierholzer history.

13. Follow-up Q&A

Q1: What if no Eulerian path exists? A1: Check in/out degree conditions and connectivity first; return [] or error.

Q2: What if input has duplicates (multiple identical tickets)? A2: Multigraph; treat as separate edges. Heap with repeats handles it naturally.

Q3: Find the lex-LARGEST itinerary instead. A3: Use max-heap (negate keys) instead of min-heap.

Q4: How is this related to de Bruijn sequences? A4: A de Bruijn sequence is an Eulerian cycle on a de Bruijn graph; same algorithm. DNA fragment assembly uses this.

Q5: Why does post-order + reverse work but pre-order doesn’t? A5: Post-order ensures dead-end subtours get added last (correct position in itinerary). Pre-order commits to a path before knowing if it leads to a dead end early.

14. Solution Walkthrough

See solution.py:

find_itinerary_hierholzer — recursive Hierholzer with min-heap; canonical.
find_itinerary_iterative — iterative Hierholzer with explicit stack.
find_itinerary_brute — permutation enumeration (only for stress with ≤6 tickets).

15. Beyond the Problem

Principal code review: “Hierholzer’s algorithm (1873) is one of those discoveries that crystalizes the post-order DFS pattern into a named technique. The intuition — ‘when you dead-end, you must be at the last node’ — is profound and counterintuitive; pre-order DFS feels natural but fails. The same algorithm powers DNA fragment assembly (de Bruijn graphs in computational biology), printer-driver-style flood fill of plotter commands, and a variety of routing problems. The Chinese Postman generalization (visit every edge with minimum total weight) extends to vehicle routing and street-sweeper scheduling. When you see ‘traverse every edge exactly once,’ the answer is always Hierholzer; the only question is which lex tie-break the problem wants. Memorize the iterative form — it’s seven lines and you’ll be asked it again.”

LeetCode Interview — Extreme Coding