p66 — Min Cost to Connect All Points

Source: LeetCode 1584 · Medium · Topics: MST, Kruskal, Prim, Union-Find Companies (2024–2025): Amazon (high), Google (high), Microsoft (medium-high), LinkedIn (medium) Loop position: MST canonical. Tests the two classical MST algorithms (Kruskal + Prim) and your judgment on which to pick given density. Often phrased as “connect houses with minimum cable” or “minimum-cost network.”

1. Quick Context

n points in 2D. Cost to connect points i, j = Manhattan distance |xi−xj| + |yi−yj|. Return min total cost to connect all points (so the graph is connected — equivalently, build a spanning tree of min weight).

This is the MST problem on a complete graph with V² edges.

Prim (better here — graph is dense, V² edges):
in_mst = [False] * n
min_cost = [INF] * n
min_cost[0] = 0
total = 0
for _ in range(n):
    u = argmin over not-in-mst of min_cost
    in_mst[u] = True
    total += min_cost[u]
    for v in range(n):
        if not in_mst[v]:
            min_cost[v] = min(min_cost[v], dist(u, v))
return total

O(V²) — perfect for V² edges. Heap-based Prim is O(E log V), Kruskal is O(E log E). With V² edges all are similar; array-Prim wins by constant.

🔗 https://leetcode.com/problems/min-cost-to-connect-all-points/

STOP. 30-min timer.

3. Prerequisite Concepts

  • Spanning tree definition: connected, acyclic, V-1 edges, spans all V nodes.
  • MST: spanning tree of minimum total edge weight (unique if all weights distinct).
  • Cut property: lightest edge crossing any cut belongs to some MST.
  • Cycle property: heaviest edge in any cycle is NOT in any MST.
  • Union-Find (p65) for Kruskal; heap or array for Prim.

4. How to Approach (FRAMEWORK 1–9)

1. Restate: MST on the complete graph induced by Manhattan distances.

2. Clarify: Distance is Manhattan (problem-stated). All points distinct? (Often yes; even if not, MST is well-defined.) Allowed duplicates? Same answer.

3. Examples: [[0,0],[2,2],[3,10],[5,2],[7,0]] → 20. Construct by hand: edges (0,0)→(2,2)=4, (2,2)→(5,2)=3, (5,2)→(7,0)=4, (5,2)→(3,10)=9. Total 20.

5. Brute force: Enumerate all spanning trees (n^(n-2) by Cayley’s formula). Infeasible.

6. Target: O(V²) via Prim with array (since edge count = V²).

7. Pattern: MST. Choose Prim (array) for dense graphs, Kruskal for sparse.

8. Develop: see solution.py.

9. Test: n=1 → 0; n=2 → distance between them; collinear points; coincident points.

5. Progressive Hints

See hints.md.

6. Deeper Insight

6.1 Why Prim wins for this problem

Edge count = V·(V-1)/2 ≈ V². Kruskal is O(E log E) = O(V² log V). Prim-with-heap is O((V + E) log V) = O(V² log V). Prim-with-array is O(V²) — strictly better when E ≈ V². On LC’s max constraints (V ≤ 1000), V² = 10⁶ vs V² log V ≈ 2·10⁷. The constant matters.

For SPARSE graphs (E = O(V)), Kruskal or heap-Prim win.

6.2 Prim’s correctness — the cut property

Cut property: For any cut (partition of vertices into S and V\S), the cheapest edge crossing the cut belongs to some MST.

Prim maintains S = set of nodes already in MST. At each step, it adds the cheapest edge from S to V\S. By the cut property, this edge is safe. After V−1 additions, S = V; we have an MST.

6.3 Kruskal’s correctness — also cut property, but globally

Algorithm: Sort edges by weight. For each in order, if its endpoints are in different DSU components, add to MST and union. Skip otherwise.

Why it works: When you consider edge e = (u, v), the partition (component of u) | rest defines a cut. Since e is the next-smallest edge and you haven’t yet connected this cut, e is the cheapest edge crossing it — safe by cut property.

6.4 Array-Prim implementation detail

min_cost[v] = “weight of cheapest edge from {already-in-MST} to v.” Initialize min_cost[0] = 0, all others INF. In each iteration:

  1. Pick u = unfinished node with smallest min_cost[u].
  2. Add min_cost[u] to total.
  3. Mark u finished.
  4. For each unfinished v, update min_cost[v] = min(min_cost[v], dist(u, v)).

Repeats V times; inner loops O(V). Total O(V²).

6.5 Heap-Prim subtlety

Heap-Prim pushes (weight, v) for each newly-improvable neighbor; doesn’t decrease-key (Python heapq lacks it). May push duplicates; on pop, skip if v is finished. Time: O(E log V) amortized. Cleaner code but slower constants here.

6.6 Manhattan-distance MST in O(V log V)

For Manhattan-distance MST in 2D, there’s a clever O(V log V) algorithm using sweep + KD-tree. Each point has at most 8 “candidate” neighbors in the MST (one per octant). Overkill for LC; mention only at Staff+. The key insight: in L1, the MST is sparse despite the complete underlying graph.

6.7 Variants and friends

  • LC 1135 Connecting Cities with Minimum Cost: explicit edges given → Kruskal.
  • LC 1631 Path with Minimum Effort: minimax path = MST or binary search.
  • LC 1168 Optimize Water Distribution: add a super-source for “well” costs, then MST.
  • Steiner Tree: MST when only a SUBSET of nodes must be connected. NP-hard.

6.8 Borůvka’s algorithm (third MST algorithm)

Each iteration: every component picks its cheapest outgoing edge; add all unique ones. Halves component count per round → O(E log V). Parallelizable. Original 1926 algorithm. Worth knowing the name at Staff+.

7. Anti-Pattern Analysis

  1. Build explicit edge list of V² edges in Python list of tuples for Kruskal — uses ~10⁸ memory for V=10⁴; OOM. Prefer array-Prim.
  2. Heap-Prim without staleness check — popping a stale (w, v) and re-processing v double-counts.
  3. Forget dist(u, u) = 0 in array-Prim — handled automatically if you initialize min_cost[0] = 0 and add it; common bug is to start total at dist(0, ?) instead of 0.
  4. Use Euclidean distance (square root) — problem says Manhattan; passing tests on a different metric is silently wrong.
  5. Recompute distance on every loop iteration — fine for clarity at V≤1000; precomputing matrix uses V² extra memory but trades for speed.
  6. Mix up MST with shortest path — they’re different problems! MST minimizes total tree weight; SSSP minimizes per-vertex distance from a root.

8. Skills & Takeaways

  • Two algorithms (Kruskal, Prim), one MST.
  • Density determines which is faster: dense → array-Prim, sparse → Kruskal.
  • Cut property as the core correctness lemma for both.
  • Build complete graphs implicitly (compute distances on the fly) rather than materializing edge lists when V² is large.

9. When to Move On

  • Implement array-Prim and Kruskal+DSU cold in 12 min each.
  • State the cut property and its role in both proofs in 60 seconds.
  • Solve LC 1584, LC 1135, LC 1168, LC 1631 in 25 min each.
  • Pick the right algorithm given density without prompting.

10. Company Context

  • Amazon: L5/L6; “warehouse network design” / “AWS region interconnect cost.”
  • Google: L5/L6; data-center cabling, fiber-laying problems.
  • Microsoft: L65+; Azure network topology.
  • LinkedIn: L5; cluster formation in connectivity graphs.

Scorecard for strong: “Recognized MST, picked Prim for density reasons, cited cut property, implemented cleanly, mentioned Kruskal as alternative + Steiner tree if subset constraint added.”

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff+
Recognize MST“It’s connecting things”Names MST+ picks algorithm by density+ Borůvka, Steiner, Manhattan MST O(V log V)
CorrectnessHand-wavyMentions greedyCites cut property+ sketches proof + cycle property
ImplementationBuggyOne worksBoth clean+ array-Prim O(V²) optimization
VariantsNoneLC 1135+ LC 1168 super-source trick+ Steiner tree NP-hardness + LP relaxation

12. Level Delta

  • Mid (L4): One MST algorithm correct (Prim or Kruskal); recognizes the problem.
  • Senior (L5): + both algorithms + density justifies choice + LC 1168 super-source pattern.
  • Staff (L6): + Borůvka’s algorithm + Steiner tree NP-hardness + Manhattan MST O(V log V) sketch + LC 1631 minimax connection.
  • Principal (L7+): + LP relaxation framing of MST + production network-design pipelines + multi-criteria MST (cost + latency) + distributed MST (GHS algorithm 1983).

13. Follow-up Q&A

Q1: What if some edges are mandatory (must be in the tree)? A1: Contract them first (union endpoints in DSU; cost += sum). Then run MST on the remainder.

Q2: What if you can only connect a SUBSET of “important” nodes (others are optional intermediates)? A2: That’s the Steiner Tree problem — NP-hard. Heuristics: minimum spanning tree on the important nodes (often a 2-approx).

Q3: What if edges are added DYNAMICALLY over time and you want the MST at each timestep? A3: Link-cut trees give O(log n) per edge insert/delete (Sleator-Tarjan).

Q4: Tie-breaking when multiple MSTs exist? A4: Any spanning tree of minimum weight is valid. To make it unique, perturb weights with small epsilons or break ties lex.

Q5: How does this generalize to k-edge-connected spanning subgraphs? A5: For k=2 (bridge-connected), polynomial via matroid intersection; for general k, harder. Standard MST is the k=1 case.

14. Solution Walkthrough

See solution.py:

  • min_cost_connect_prim_array — O(V²) Prim with min_cost array (canonical).
  • min_cost_connect_kruskal — sort all edges, union-find.
  • min_cost_connect_brute — try every spanning tree (Cayley enumeration via Prüfer). For stress; works only n ≤ 6.

15. Beyond the Problem

Principal code review: “MST is the prototypical greedy-is-optimal proof in algorithms textbooks. The cut property tells you that local cheapest choices, made carefully, compose into a global optimum — but the carefulness matters: you must choose edges crossing some cut, not arbitrary edges. Kruskal exploits this via global edge-sorting; Prim via expanding-front. Borůvka via parallel cheapest-edge-per-component. Three different orchestrations of the same greedy idea. In production network design at AWS/GCP scale, MST is the seed; real systems layer on redundancy (k-edge-connected), latency constraints, and multi-criteria optimization, eventually becoming integer programs. But the MST intuition — ‘spanning tree’ as the cheapest ‘just-barely-connected’ state — survives. Knowing when MST is enough vs when you need more machinery is the senior+ judgment call.”