p66 — Min Cost to Connect All Points
Source: LeetCode 1584 · Medium · Topics: MST, Kruskal, Prim, Union-Find Companies (2024–2025): Amazon (high), Google (high), Microsoft (medium-high), LinkedIn (medium) Loop position: MST canonical. Tests the two classical MST algorithms (Kruskal + Prim) and your judgment on which to pick given density. Often phrased as “connect houses with minimum cable” or “minimum-cost network.”
1. Quick Context
n points in 2D. Cost to connect points i, j = Manhattan distance |xi−xj| + |yi−yj|. Return min total cost to connect all points (so the graph is connected — equivalently, build a spanning tree of min weight).
This is the MST problem on a complete graph with V² edges.
Prim (better here — graph is dense, V² edges):
in_mst = [False] * n
min_cost = [INF] * n
min_cost[0] = 0
total = 0
for _ in range(n):
u = argmin over not-in-mst of min_cost
in_mst[u] = True
total += min_cost[u]
for v in range(n):
if not in_mst[v]:
min_cost[v] = min(min_cost[v], dist(u, v))
return total
O(V²) — perfect for V² edges. Heap-based Prim is O(E log V), Kruskal is O(E log E). With V² edges all are similar; array-Prim wins by constant.
2. LeetCode Link + Attempt Gate
🔗 https://leetcode.com/problems/min-cost-to-connect-all-points/
STOP. 30-min timer.
3. Prerequisite Concepts
- Spanning tree definition: connected, acyclic, V-1 edges, spans all V nodes.
- MST: spanning tree of minimum total edge weight (unique if all weights distinct).
- Cut property: lightest edge crossing any cut belongs to some MST.
- Cycle property: heaviest edge in any cycle is NOT in any MST.
- Union-Find (p65) for Kruskal; heap or array for Prim.
4. How to Approach (FRAMEWORK 1–9)
1. Restate: MST on the complete graph induced by Manhattan distances.
2. Clarify: Distance is Manhattan (problem-stated). All points distinct? (Often yes; even if not, MST is well-defined.) Allowed duplicates? Same answer.
3. Examples: [[0,0],[2,2],[3,10],[5,2],[7,0]] → 20. Construct by hand: edges (0,0)→(2,2)=4, (2,2)→(5,2)=3, (5,2)→(7,0)=4, (5,2)→(3,10)=9. Total 20.
5. Brute force: Enumerate all spanning trees (n^(n-2) by Cayley’s formula). Infeasible.
6. Target: O(V²) via Prim with array (since edge count = V²).
7. Pattern: MST. Choose Prim (array) for dense graphs, Kruskal for sparse.
8. Develop: see solution.py.
9. Test: n=1 → 0; n=2 → distance between them; collinear points; coincident points.
5. Progressive Hints
See hints.md.
6. Deeper Insight
6.1 Why Prim wins for this problem
Edge count = V·(V-1)/2 ≈ V². Kruskal is O(E log E) = O(V² log V). Prim-with-heap is O((V + E) log V) = O(V² log V). Prim-with-array is O(V²) — strictly better when E ≈ V². On LC’s max constraints (V ≤ 1000), V² = 10⁶ vs V² log V ≈ 2·10⁷. The constant matters.
For SPARSE graphs (E = O(V)), Kruskal or heap-Prim win.
6.2 Prim’s correctness — the cut property
Cut property: For any cut (partition of vertices into S and V\S), the cheapest edge crossing the cut belongs to some MST.
Prim maintains S = set of nodes already in MST. At each step, it adds the cheapest edge from S to V\S. By the cut property, this edge is safe. After V−1 additions, S = V; we have an MST.
6.3 Kruskal’s correctness — also cut property, but globally
Algorithm: Sort edges by weight. For each in order, if its endpoints are in different DSU components, add to MST and union. Skip otherwise.
Why it works: When you consider edge e = (u, v), the partition (component of u) | rest defines a cut. Since e is the next-smallest edge and you haven’t yet connected this cut, e is the cheapest edge crossing it — safe by cut property.
6.4 Array-Prim implementation detail
min_cost[v] = “weight of cheapest edge from {already-in-MST} to v.” Initialize min_cost[0] = 0, all others INF. In each iteration:
- Pick
u= unfinished node with smallestmin_cost[u]. - Add
min_cost[u]to total. - Mark
ufinished. - For each unfinished
v, updatemin_cost[v] = min(min_cost[v], dist(u, v)).
Repeats V times; inner loops O(V). Total O(V²).
6.5 Heap-Prim subtlety
Heap-Prim pushes (weight, v) for each newly-improvable neighbor; doesn’t decrease-key (Python heapq lacks it). May push duplicates; on pop, skip if v is finished. Time: O(E log V) amortized. Cleaner code but slower constants here.
6.6 Manhattan-distance MST in O(V log V)
For Manhattan-distance MST in 2D, there’s a clever O(V log V) algorithm using sweep + KD-tree. Each point has at most 8 “candidate” neighbors in the MST (one per octant). Overkill for LC; mention only at Staff+. The key insight: in L1, the MST is sparse despite the complete underlying graph.
6.7 Variants and friends
- LC 1135 Connecting Cities with Minimum Cost: explicit edges given → Kruskal.
- LC 1631 Path with Minimum Effort: minimax path = MST or binary search.
- LC 1168 Optimize Water Distribution: add a super-source for “well” costs, then MST.
- Steiner Tree: MST when only a SUBSET of nodes must be connected. NP-hard.
6.8 Borůvka’s algorithm (third MST algorithm)
Each iteration: every component picks its cheapest outgoing edge; add all unique ones. Halves component count per round → O(E log V). Parallelizable. Original 1926 algorithm. Worth knowing the name at Staff+.
7. Anti-Pattern Analysis
- Build explicit edge list of V² edges in Python list of tuples for Kruskal — uses ~10⁸ memory for V=10⁴; OOM. Prefer array-Prim.
- Heap-Prim without staleness check — popping a stale
(w, v)and re-processing v double-counts. - Forget
dist(u, u) = 0in array-Prim — handled automatically if you initializemin_cost[0] = 0and add it; common bug is to start total atdist(0, ?)instead of 0. - Use Euclidean distance (square root) — problem says Manhattan; passing tests on a different metric is silently wrong.
- Recompute distance on every loop iteration — fine for clarity at V≤1000; precomputing matrix uses V² extra memory but trades for speed.
- Mix up MST with shortest path — they’re different problems! MST minimizes total tree weight; SSSP minimizes per-vertex distance from a root.
8. Skills & Takeaways
- Two algorithms (Kruskal, Prim), one MST.
- Density determines which is faster: dense → array-Prim, sparse → Kruskal.
- Cut property as the core correctness lemma for both.
- Build complete graphs implicitly (compute distances on the fly) rather than materializing edge lists when V² is large.
9. When to Move On
- Implement array-Prim and Kruskal+DSU cold in 12 min each.
- State the cut property and its role in both proofs in 60 seconds.
- Solve LC 1584, LC 1135, LC 1168, LC 1631 in 25 min each.
- Pick the right algorithm given density without prompting.
10. Company Context
- Amazon: L5/L6; “warehouse network design” / “AWS region interconnect cost.”
- Google: L5/L6; data-center cabling, fiber-laying problems.
- Microsoft: L65+; Azure network topology.
- LinkedIn: L5; cluster formation in connectivity graphs.
Scorecard for strong: “Recognized MST, picked Prim for density reasons, cited cut property, implemented cleanly, mentioned Kruskal as alternative + Steiner tree if subset constraint added.”
11. Interviewer’s Lens
| Signal | Junior | Mid | Senior | Staff+ |
|---|---|---|---|---|
| Recognize MST | “It’s connecting things” | Names MST | + picks algorithm by density | + Borůvka, Steiner, Manhattan MST O(V log V) |
| Correctness | Hand-wavy | Mentions greedy | Cites cut property | + sketches proof + cycle property |
| Implementation | Buggy | One works | Both clean | + array-Prim O(V²) optimization |
| Variants | None | LC 1135 | + LC 1168 super-source trick | + Steiner tree NP-hardness + LP relaxation |
12. Level Delta
- Mid (L4): One MST algorithm correct (Prim or Kruskal); recognizes the problem.
- Senior (L5): + both algorithms + density justifies choice + LC 1168 super-source pattern.
- Staff (L6): + Borůvka’s algorithm + Steiner tree NP-hardness + Manhattan MST O(V log V) sketch + LC 1631 minimax connection.
- Principal (L7+): + LP relaxation framing of MST + production network-design pipelines + multi-criteria MST (cost + latency) + distributed MST (GHS algorithm 1983).
13. Follow-up Q&A
Q1: What if some edges are mandatory (must be in the tree)? A1: Contract them first (union endpoints in DSU; cost += sum). Then run MST on the remainder.
Q2: What if you can only connect a SUBSET of “important” nodes (others are optional intermediates)? A2: That’s the Steiner Tree problem — NP-hard. Heuristics: minimum spanning tree on the important nodes (often a 2-approx).
Q3: What if edges are added DYNAMICALLY over time and you want the MST at each timestep? A3: Link-cut trees give O(log n) per edge insert/delete (Sleator-Tarjan).
Q4: Tie-breaking when multiple MSTs exist? A4: Any spanning tree of minimum weight is valid. To make it unique, perturb weights with small epsilons or break ties lex.
Q5: How does this generalize to k-edge-connected spanning subgraphs? A5: For k=2 (bridge-connected), polynomial via matroid intersection; for general k, harder. Standard MST is the k=1 case.
14. Solution Walkthrough
See solution.py:
min_cost_connect_prim_array— O(V²) Prim withmin_costarray (canonical).min_cost_connect_kruskal— sort all edges, union-find.min_cost_connect_brute— try every spanning tree (Cayley enumeration via Prüfer). For stress; works only n ≤ 6.
15. Beyond the Problem
Principal code review: “MST is the prototypical greedy-is-optimal proof in algorithms textbooks. The cut property tells you that local cheapest choices, made carefully, compose into a global optimum — but the carefulness matters: you must choose edges crossing some cut, not arbitrary edges. Kruskal exploits this via global edge-sorting; Prim via expanding-front. Borůvka via parallel cheapest-edge-per-component. Three different orchestrations of the same greedy idea. In production network design at AWS/GCP scale, MST is the seed; real systems layer on redundancy (k-edge-connected), latency constraints, and multi-criteria optimization, eventually becoming integer programs. But the MST intuition — ‘spanning tree’ as the cheapest ‘just-barely-connected’ state — survives. Knowing when MST is enough vs when you need more machinery is the senior+ judgment call.”