Lab 08 — MST via Kruskal (Min Cost to Connect All Points)

Goal

Build a minimum spanning tree (MST) on a complete graph derived from N points using Kruskal’s algorithm. After this lab you should be able to recognize the MST signal in <60 seconds, write Kruskal from a blank screen (sort + DSU + early-exit) in <8 minutes, and reason about when Kruskal beats Prim (sparse graphs, edge list already given) and vice versa (dense graphs, adjacency matrix).

Background Concepts

A spanning tree of a connected graph G is a subgraph that includes all V vertices and exactly V - 1 edges with no cycles. The MST is the spanning tree with minimum total edge weight. Two canonical algorithms:

Kruskal’s: Sort all edges by weight ascending. Iterate; for each edge, union the endpoints if they’re in different components (use DSU); add to MST. Stop after V - 1 edges. Time O(E log E).
Prim’s: Start from any vertex; maintain a min-heap of crossing edges; repeatedly extract the lightest edge to a new vertex. Time O(E log V) with a binary heap; O(E + V log V) with a Fibonacci heap.

For “connect all points” with edge weights = pairwise Manhattan distance, the graph is complete: E = V·(V-1)/2 ≈ V². At V = 1000, E ≈ 5 × 10^5 edges. Either algorithm works; Kruskal with DSU is the cleanest expression because we already have the edge list.

Interview Context

Min Cost to Connect All Points (LC 1584) is a Medium asked at Amazon, Bloomberg, and Salesforce. It’s a clean MST signal: “minimum total cost to make everything connected.” The senior signal is naming the problem ("this is MST on a complete graph") within 60 seconds, then choosing Kruskal vs Prim consciously based on density. Strong candidates also state the Cut Property as the correctness foundation.

Problem Statement

Given an array points where points[i] = [xi, yi] represents a point in 2D, the cost of connecting two points is the Manhattan distance between them: |xi - xj| + |yi - yj|. Return the minimum cost to connect all points, where any two points are connected if there is a path between them.

Constraints

1 ≤ |points| ≤ 1000
−10^6 ≤ xi, yi ≤ 10^6
All points are distinct.

Clarifying Questions

Manhattan, Euclidean, or other metric? (Manhattan, per problem.)
Are diagonal connections counted? (Implicitly yes — we connect any two points directly.)
Are points distinct? (Yes, per constraint.)
Single point — cost? (0, no edges needed.)
Should the answer fit in 32-bit? (Max cost ≈ 999 · 4 × 10^6 ≈ 4 × 10^9 — use 64-bit just in case, though Python int is unbounded.)

Examples

points = [[0,0],[2,2],[3,10],[5,2],[7,0]]
→ 20

points = [[3,12],[-2,5],[-4,1]]
→ 18

points = [[0,0]]
→ 0

Initial Brute Force

Enumerate all spanning trees and pick the one with minimum total weight. There are exponentially many. Infeasible.

A second “brute force” is Prim’s via array-scan (no heap): O(V²). At V = 1000: 10^6 ops — passes easily and is simpler than the heap version. This is actually a competitive option for this problem.

Brute Force Complexity

Spanning-tree enumeration: O(V^(V-2)) by Cayley’s formula. Infeasible. Array-scan Prim: O(V²). At V = 1000: 10^6 ops, well within budget.

Optimization Path

For dense graphs (E ~ V²), array-scan Prim is O(V²) — wins over Kruskal’s O(E log E) = O(V² log V). For sparse graphs, Kruskal or heap-Prim wins.

For LC 1584 specifically (V = 1000, dense), all three pass:

Kruskal: O(V² log V²) = O(V² log V) = ~10^7 ops; passes in 1-2s.
Heap-Prim: O(V² log V); same.
Array-scan Prim: O(V²); fastest.

In interviews, Kruskal is the “safer” choice because the code is mechanical: edges → sort → DSU → loop. Show you can choose array-scan Prim when asked about dense graphs.

Final Expected Approach (Kruskal)

Generate all V·(V-1)/2 edges with weight = Manhattan distance.
Sort by weight ascending.
Initialize DSU with V components.
Iterate edges; if endpoints differ, union and add weight to total; count edges added.
Stop when edges added == V - 1 (early exit).

Data Structures Used

List of edges as (weight, u, v) tuples.
DSU as in Lab 07 (path compression + union by rank).
Integer accumulator for total cost.

Correctness Argument

Cut Property: For any cut (partition of vertices into two non-empty sets), the minimum-weight edge crossing the cut belongs to some MST. Kruskal greedily picks the lightest edge that doesn’t create a cycle (i.e., the lightest edge crossing some cut between two components); by the Cut Property, this edge is safe — there is an MST containing it. Repeating this V - 1 times produces an MST.

No-cycle invariant: DSU’s find ensures we add an edge only when its endpoints are in different components. Since adding an edge between same-component endpoints creates a cycle, this is exactly the cycle-prevention check.

Termination: Each union reduces component count by 1; after V - 1 unions, the graph is connected. We stop early.

Complexity

Operation	Time	Space
Edge generation	O(V²)	O(V²)
Sort	O(V² log V)	(in-place possible)
DSU loop	O(V² · α)	O(V)
Total	O(V² log V)	O(V²)

Implementation Requirements

class DSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path compression by halving
            x = self.parent[x]
        return x
    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return False
        if self.rank[rx] < self.rank[ry]: rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]: self.rank[rx] += 1
        return True

def minCostConnectPoints(points):
    n = len(points)
    edges = []
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            edges.append((abs(xi - xj) + abs(yi - yj), i, j))
    edges.sort()
    dsu = DSU(n)
    total, count = 0, 0
    for w, u, v in edges:
        if dsu.union(u, v):
            total += w
            count += 1
            if count == n - 1: break
    return total

Tests

Standard: 5 points → 20.
Single point: 0.
Two points: their Manhattan distance.
Collinear points (all on x-axis): MST weight = max(x) - min(x).
Stress: V = 1000 random — verify Kruskal and Prim agree.
Adversarial: points on a grid — many edges of equal weight (test tie-breaking is stable).

Follow-up Questions

“Connecting Cities With Minimum Cost (LC 1135)” → Same MST template; if multiple MSTs valid, return any total cost; -1 if disconnected.
“Optimize Water Distribution in a Village (LC 1168)” → Add a virtual node 0 connected to each house with the well-cost; run MST on V + 1 nodes.
“Critical Connections (bridges)” → Different problem (Tarjan’s bridge algorithm), see Phase README.
“Maximum Spanning Tree.” → Sort descending; same algorithm.
“Online edge insertion: maintain MST.” → Link-Cut Trees; advanced.

Product Extension

Network design (laying fiber, planning power grids), clustering algorithms (single-linkage clustering = MST followed by cut-largest-edges), image segmentation, and approximation algorithms for TSP all use MST as a primitive. AWS / Azure data-center backbone planning uses MST variants weighted by latency × cost.

Language/Runtime Follow-ups

Python: edges.sort() on tuples sorts lexicographically — (weight, u, v) works. heapq is overkill since we need all edges sorted upfront, not on-demand.
Java: Arrays.sort(int[][]) with a comparator on the weight column. Use int[] triples for cache locality.
Go: sort.Slice(edges, func(i, j int) bool { return edges[i].w < edges[j].w }).
C++: vector<tuple<int,int,int>> with default < ordering; sort(edges.begin(), edges.end()).
JS/TS: edges.sort((a, b) => a[0] - b[0]). Avoid a[0] < b[0] returning a boolean (subtle bug).

Common Bugs

Forgetting the early exit when edges added = V - 1 — works but processes more edges than needed.
Generating duplicate edges (i.e., (i, j) and (j, i)) — wastes time but doesn’t break correctness.
Off-by-one in V - 1 — counting edges incorrectly leads to incomplete tree.
Comparator returns boolean in JS — use subtraction.
Integer overflow on edge weights — Manhattan distance bounded by 4 × 10^6; sum bounded by ~4 × 10^9, fits in 64-bit. In Python no issue; in Java use long.
DSU’s union returns nothing vs returns success boolean — pick a consistent API.
Returning the count of edges instead of total weight — silent.

Debugging Strategy

For small inputs, print the sorted edge list and the DSU state after each union. Verify the total edges added is exactly V - 1. If the result is too large, check that you’re not adding edges within the same component (the cycle check might be skipped). Compare against a Prim’s reference for stress tests.

Mastery Criteria

Recognized “minimum cost to connect” as MST in <60 seconds.
Wrote Kruskal from blank screen in <8 minutes.
Chose Kruskal vs Prim consciously based on density when asked.
Stated the Cut Property as the correctness foundation in <30 seconds.
Solved LC 1584 in <20 minutes from cold start.
Solved LC 1135 (Connecting Cities) in <12 minutes by extending the template.
Solved LC 1168 (Water Distribution) in <15 minutes with the virtual-node trick.
Stated the V² Prim option for dense graphs in <30 seconds.

LeetCode Interview — Extreme Coding