p66 — Min Cost to Connect All Points

Source: LeetCode 1584 · Medium · Topics: MST, Kruskal, Prim, Union-Find Companies (2024–2025): Amazon (high), Google (high), Microsoft (medium-high), LinkedIn (medium) Loop position: MST canonical. Tests the two classical MST algorithms (Kruskal + Prim) and your judgment on which to pick given density. Often phrased as “connect houses with minimum cable” or “minimum-cost network.”

1. Quick Context

n points in 2D. Cost to connect points i, j = Manhattan distance |xi−xj| + |yi−yj|. Return min total cost to connect all points (so the graph is connected — equivalently, build a spanning tree of min weight).

This is the MST problem on a complete graph with V² edges.

Prim (better here — graph is dense, V² edges):
in_mst = [False] * n
min_cost = [INF] * n
min_cost[0] = 0
total = 0
for _ in range(n):
    u = argmin over not-in-mst of min_cost
    in_mst[u] = True
    total += min_cost[u]
    for v in range(n):
        if not in_mst[v]:
            min_cost[v] = min(min_cost[v], dist(u, v))
return total

O(V²) — perfect for V² edges. Heap-based Prim is O(E log V), Kruskal is O(E log E). With V² edges all are similar; array-Prim wins by constant.

2. LeetCode Link + Attempt Gate

🔗 https://leetcode.com/problems/min-cost-to-connect-all-points/

STOP. 30-min timer.

3. Prerequisite Concepts

Spanning tree definition: connected, acyclic, V-1 edges, spans all V nodes.
MST: spanning tree of minimum total edge weight (unique if all weights distinct).
Cut property: lightest edge crossing any cut belongs to some MST.
Cycle property: heaviest edge in any cycle is NOT in any MST.
Union-Find (p65) for Kruskal; heap or array for Prim.

4. How to Approach (FRAMEWORK 1–9)

1. Restate: MST on the complete graph induced by Manhattan distances.

2. Clarify: Distance is Manhattan (problem-stated). All points distinct? (Often yes; even if not, MST is well-defined.) Allowed duplicates? Same answer.

3. Examples: [[0,0],[2,2],[3,10],[5,2],[7,0]] → 20. Construct by hand: edges (0,0)→(2,2)=4, (2,2)→(5,2)=3, (5,2)→(7,0)=4, (5,2)→(3,10)=9. Total 20.

5. Brute force: Enumerate all spanning trees (n^(n-2) by Cayley’s formula). Infeasible.

6. Target: O(V²) via Prim with array (since edge count = V²).

7. Pattern: MST. Choose Prim (array) for dense graphs, Kruskal for sparse.

8. Develop: see solution.py.

9. Test: n=1 → 0; n=2 → distance between them; collinear points; coincident points.

5. Progressive Hints

See hints.md.

6. Deeper Insight

6.1 Why Prim wins for this problem

Edge count = V·(V-1)/2 ≈ V². Kruskal is O(E log E) = O(V² log V). Prim-with-heap is O((V + E) log V) = O(V² log V). Prim-with-array is O(V²) — strictly better when E ≈ V². On LC’s max constraints (V ≤ 1000), V² = 10⁶ vs V² log V ≈ 2·10⁷. The constant matters.

For SPARSE graphs (E = O(V)), Kruskal or heap-Prim win.

6.2 Prim’s correctness — the cut property

Cut property: For any cut (partition of vertices into S and V\S), the cheapest edge crossing the cut belongs to some MST.

Prim maintains S = set of nodes already in MST. At each step, it adds the cheapest edge from S to V\S. By the cut property, this edge is safe. After V−1 additions, S = V; we have an MST.

6.3 Kruskal’s correctness — also cut property, but globally

Algorithm: Sort edges by weight. For each in order, if its endpoints are in different DSU components, add to MST and union. Skip otherwise.

Why it works: When you consider edge e = (u, v), the partition (component of u) | rest defines a cut. Since e is the next-smallest edge and you haven’t yet connected this cut, e is the cheapest edge crossing it — safe by cut property.

6.4 Array-Prim implementation detail

min_cost[v] = “weight of cheapest edge from {already-in-MST} to v.” Initialize min_cost[0] = 0, all others INF. In each iteration:

Pick u = unfinished node with smallest min_cost[u].
Add min_cost[u] to total.
Mark u finished.
For each unfinished v, update min_cost[v] = min(min_cost[v], dist(u, v)).

Repeats V times; inner loops O(V). Total O(V²).

6.5 Heap-Prim subtlety

Heap-Prim pushes (weight, v) for each newly-improvable neighbor; doesn’t decrease-key (Python heapq lacks it). May push duplicates; on pop, skip if v is finished. Time: O(E log V) amortized. Cleaner code but slower constants here.

6.6 Manhattan-distance MST in O(V log V)

For Manhattan-distance MST in 2D, there’s a clever O(V log V) algorithm using sweep + KD-tree. Each point has at most 8 “candidate” neighbors in the MST (one per octant). Overkill for LC; mention only at Staff+. The key insight: in L1, the MST is sparse despite the complete underlying graph.

6.7 Variants and friends

LC 1135 Connecting Cities with Minimum Cost: explicit edges given → Kruskal.
LC 1631 Path with Minimum Effort: minimax path = MST or binary search.
LC 1168 Optimize Water Distribution: add a super-source for “well” costs, then MST.
Steiner Tree: MST when only a SUBSET of nodes must be connected. NP-hard.

6.8 Borůvka’s algorithm (third MST algorithm)

Each iteration: every component picks its cheapest outgoing edge; add all unique ones. Halves component count per round → O(E log V). Parallelizable. Original 1926 algorithm. Worth knowing the name at Staff+.

7. Anti-Pattern Analysis

Build explicit edge list of V² edges in Python list of tuples for Kruskal — uses ~10⁸ memory for V=10⁴; OOM. Prefer array-Prim.
Heap-Prim without staleness check — popping a stale (w, v) and re-processing v double-counts.
Forget dist(u, u) = 0 in array-Prim — handled automatically if you initialize min_cost[0] = 0 and add it; common bug is to start total at dist(0, ?) instead of 0.
Use Euclidean distance (square root) — problem says Manhattan; passing tests on a different metric is silently wrong.
Recompute distance on every loop iteration — fine for clarity at V≤1000; precomputing matrix uses V² extra memory but trades for speed.
Mix up MST with shortest path — they’re different problems! MST minimizes total tree weight; SSSP minimizes per-vertex distance from a root.

8. Skills & Takeaways

Two algorithms (Kruskal, Prim), one MST.
Density determines which is faster: dense → array-Prim, sparse → Kruskal.
Cut property as the core correctness lemma for both.
Build complete graphs implicitly (compute distances on the fly) rather than materializing edge lists when V² is large.

9. When to Move On

Implement array-Prim and Kruskal+DSU cold in 12 min each.
State the cut property and its role in both proofs in 60 seconds.
Solve LC 1584, LC 1135, LC 1168, LC 1631 in 25 min each.
Pick the right algorithm given density without prompting.

10. Company Context

Amazon: L5/L6; “warehouse network design” / “AWS region interconnect cost.”
Google: L5/L6; data-center cabling, fiber-laying problems.
Microsoft: L65+; Azure network topology.
LinkedIn: L5; cluster formation in connectivity graphs.

Scorecard for strong: “Recognized MST, picked Prim for density reasons, cited cut property, implemented cleanly, mentioned Kruskal as alternative + Steiner tree if subset constraint added.”

11. Interviewer’s Lens

Signal	Junior	Mid	Senior	Staff+
Recognize MST	“It’s connecting things”	Names MST	+ picks algorithm by density	+ Borůvka, Steiner, Manhattan MST O(V log V)
Correctness	Hand-wavy	Mentions greedy	Cites cut property	+ sketches proof + cycle property
Implementation	Buggy	One works	Both clean	+ array-Prim O(V²) optimization
Variants	None	LC 1135	+ LC 1168 super-source trick	+ Steiner tree NP-hardness + LP relaxation

12. Level Delta

Mid (L4): One MST algorithm correct (Prim or Kruskal); recognizes the problem.
Senior (L5): + both algorithms + density justifies choice + LC 1168 super-source pattern.
Staff (L6): + Borůvka’s algorithm + Steiner tree NP-hardness + Manhattan MST O(V log V) sketch + LC 1631 minimax connection.
Principal (L7+): + LP relaxation framing of MST + production network-design pipelines + multi-criteria MST (cost + latency) + distributed MST (GHS algorithm 1983).

13. Follow-up Q&A

Q1: What if some edges are mandatory (must be in the tree)? A1: Contract them first (union endpoints in DSU; cost += sum). Then run MST on the remainder.

Q2: What if you can only connect a SUBSET of “important” nodes (others are optional intermediates)? A2: That’s the Steiner Tree problem — NP-hard. Heuristics: minimum spanning tree on the important nodes (often a 2-approx).

Q3: What if edges are added DYNAMICALLY over time and you want the MST at each timestep? A3: Link-cut trees give O(log n) per edge insert/delete (Sleator-Tarjan).

Q4: Tie-breaking when multiple MSTs exist? A4: Any spanning tree of minimum weight is valid. To make it unique, perturb weights with small epsilons or break ties lex.

Q5: How does this generalize to k-edge-connected spanning subgraphs? A5: For k=2 (bridge-connected), polynomial via matroid intersection; for general k, harder. Standard MST is the k=1 case.

14. Solution Walkthrough

See solution.py:

min_cost_connect_prim_array — O(V²) Prim with min_cost array (canonical).
min_cost_connect_kruskal — sort all edges, union-find.
min_cost_connect_brute — try every spanning tree (Cayley enumeration via Prüfer). For stress; works only n ≤ 6.

15. Beyond the Problem

Principal code review: “MST is the prototypical greedy-is-optimal proof in algorithms textbooks. The cut property tells you that local cheapest choices, made carefully, compose into a global optimum — but the carefulness matters: you must choose edges crossing some cut, not arbitrary edges. Kruskal exploits this via global edge-sorting; Prim via expanding-front. Borůvka via parallel cheapest-edge-per-component. Three different orchestrations of the same greedy idea. In production network design at AWS/GCP scale, MST is the seed; real systems layer on redundancy (k-edge-connected), latency constraints, and multi-criteria optimization, eventually becoming integer programs. But the MST intuition — ‘spanning tree’ as the cheapest ‘just-barely-connected’ state — survives. Knowing when MST is enough vs when you need more machinery is the senior+ judgment call.”

LeetCode Interview — Extreme Coding