Lab 08 — Bitmask Dynamic Programming

Goal

Solve LC 847 — Shortest Path Visiting All Nodes — using bitmask DP over (visited-set, current-node) state. Internalize the recipe: when N ≤ 20-ish and the state involves “subset of which items have been used / visited / assigned”, a bitmask is the state and the transition is a bit OR.

Background Concepts

A bitmask is an integer interpreted as a set: bit k is 1 iff element k is in the set. Set operations:

Union: a | b. Intersection: a & b. Difference: a & ~b. Symmetric difference: a ^ b.
Add element k: a | (1 << k). Remove: a & ~(1 << k). Test: (a >> k) & 1 or a & (1 << k).
Iterate all subsets of mask: s = mask; while s > 0: ...; s = (s - 1) & mask. Iterates 2^popcount(mask) subsets.
Iterate set bits: while mask: k = (mask & -mask).bit_length() - 1; mask &= mask - 1.

Bitmask DP stores dp[mask][...] indexed by the subset. Useful when N ≤ ~20 (so 2^N ≤ 10⁶) and the state must remember “exactly which subset has been processed”. It generalizes:

TSP-like: dp[mask][i] = min cost path that visited exactly mask, ending at i.
Subset-cover: dp[mask] = min cost to cover mask`` summed over groups.
Assignment problem: dp[mask] = min cost to assign first popcount(mask) people to the jobs in mask``.

For LC 847, we want shortest walk (edges may repeat) visiting all nodes. The state (mask, i) captures “I’ve visited the set mask of nodes (at least once) and I’m currently at node i”. Transitions: from (mask, i), move to any neighbor j, new state (mask | (1 << j), j), cost +1. We want the smallest distance to any state (full_mask, *). BFS suffices since edge cost is 1.

Interview Context

Bitmask DP is a 1-2% problem family but a strong-hire signal when recognized fast. The trigger: N ≤ 20 with a constraint involving subsets. Asked at: Google occasionally, Stripe / Two Sigma, Meta in bar-raiser slot. Common trap is recognizing 2^N · N is feasible at N=15 (~5 × 10⁵) but not at N=25 (~8 × 10⁸).

Problem Statement (LC 847)

Given an undirected, connected graph of N nodes labeled 0..N-1 as adjacency lists, return the length of the shortest path that visits every node. You may start and stop at any node, may revisit nodes and edges.

Constraints

1 ≤ N ≤ 12
0 ≤ graph[i].length < N
Graph is connected.

Clarifying Questions

Length = number of edges (not nodes)? (Yes — number of edges traversed.)
Are self-loops possible? (No.)
May the path start and end at different nodes? (Yes.)
Is the graph guaranteed connected? (Yes — answer always finite.)

Examples

graph = [[1, 2, 3], [0], [0], [0]]
       (star with center 0, leaves 1, 2, 3)
shortest path visiting all = 4 (e.g., 1 → 0 → 2 → 0 → 3)

graph = [[1], [0, 2, 4], [1, 3, 4], [2], [1, 2]]
shortest = 4

Initial Brute Force

DFS / backtracking from every starting node, exploring all walks up to some bounded length. Without memoization, walks can be exponential in length even for small graphs. A timeout and a hand-tuned bound make this brittle.

Brute Force Complexity

Unbounded (or exponential with bound). Practically TLE for any non-trivial test.

Optimization Path

The state space is (mask, current_node) with mask ∈ [0, 2^N) and current_node ∈ [0, N). Total states: N · 2^N. For N=12: 12 · 4096 = 49152. Each state has ≤ N-1 outgoing transitions; total edges: N² · 2^N ≈ 6 × 10⁵. Trivially feasible.

Since edge weights are 1, BFS over the state graph from all starting states {(1 << i, i) : i ∈ [0, N)} (all “I’ve visited just myself” states) gives the shortest distance to each state. Stop when we dequeue a state with mask = (1 << N) - 1.

Final Expected Approach

full = (1 << N) - 1.
Initialize a queue with all (mask=1<<i, node=i) for i ∈ [0, N).
seen[(mask, node)] initialized for those starts.
BFS: pop (mask, u); for each neighbor v, new_mask = mask | (1 << v); if (new_mask, v) unseen, enqueue with dist+1.
First time a state with mask == full is dequeued, return its distance.

Data Structures Used

deque for BFS frontier.
2D seen of shape [2^N][N] (boolean) or a set.
Distance tracked alongside state in the queue (dist+1 per step).

Correctness Argument

The state graph is a directed graph on N · 2^N states; an edge (mask, u) → (mask | (1 << v), v) exists iff v is a graph neighbor of u. A walk in the original graph that visits all N nodes corresponds to a path in the state graph from some (1<<i, i) to some (full, j). Edge costs are 1 (one edge traversed per state-graph edge). Therefore shortest walk = shortest path in state graph from the start set to any final state, computed by multi-source BFS.

BFS visits each state once and terminates at the first finalized state. Correct because all edge weights equal.

Complexity

Quantity	Value
States	N · 2^N
Transitions	up to N² · 2^N
Time	O(N² · 2^N)
Space	O(N · 2^N)

At N=12: ~6 × 10⁵ ops. Fast.

Implementation Requirements

from collections import deque

def shortestPathLength(graph):
    n = len(graph)
    if n == 1: return 0
    full = (1 << n) - 1
    # State: (mask, node). Distance tracked by BFS layer.
    seen = [[False] * n for _ in range(1 << n)]
    q = deque()
    for i in range(n):
        seen[1 << i][i] = True
        q.append((1 << i, i, 0))
    while q:
        mask, u, d = q.popleft()
        if mask == full:
            return d
        for v in graph[u]:
            new_mask = mask | (1 << v)
            if not seen[new_mask][v]:
                seen[new_mask][v] = True
                q.append((new_mask, v, d + 1))
    return -1  # unreachable for connected graphs

Tests

N=1: return 0.
N=2 with one edge: return 1.
Star example: 4.
Linear chain 0-1-2-3-4: shortest visiting all = 4.
Complete graph K_5: shortest = 4 (any Hamiltonian path).
Disconnected (constraint says connected, but defensive): return -1 / handle.
Stress: random connected graphs N=8..12 vs Held-Karp O(N² · 2^N) DP for cross-check.

Follow-up Questions

“What if edges have weights?” → Dijkstra instead of BFS; same state graph.
“What if I must start and end at node 0 (TSP)?” → state (mask, i) with cost dp[mask][i] = min cost, recurrence dp[mask | (1 << j)][j] = min(dp[mask][i] + w(i, j)). Answer: min(dp[full][i] + w(i, 0)).
“What if N=20?” → 20 · 2^20 = 2 × 10⁷ states, still ok. At N=25 we hit 8 × 10⁸ — likely TLE. The constraint cap on bitmask DP is N ~ 22.
“Subset-cover variant (LC 1125 — Smallest Sufficient Team).” → dp[mask] = min team to cover skill-mask mask; transition: for each person p with skill-mask pm, dp[mask | pm] = min(dp[mask] + 1).
“Assignment problem in bitmask DP.” → dp[mask] = min cost to assign popcount(mask) people to the jobs in mask; transition over which job person popcount(mask) takes.

Product Extension

Vehicle routing / drone delivery with ≤ 20 stops: bitmask DP precomputes optimal tours offline. Interview-scheduling problems (LC 1066): assign N workers to N tasks minimizing cost; the assignment-DP variant of bitmask DP runs in O(N · 2^N), beating the O(N · N!) brute force at N=15 by 9 orders of magnitude.

Language/Runtime Follow-ups

Python: BFS with deque; the inner loop can be slow at N=12. PyPy if benchmarking. For N>14, switch to numpy or pre-flatten the seen array to bytearray.
Java: boolean[][] seen = new boolean[1 << n][n]. The deque ArrayDeque<int[]> boxes each state; for performance pack (mask, node, dist) into a long.
Go: idiomatic [][]bool. Use slice queue with head/tail indices to avoid alloc churn.
C++: vector<vector<bool>> seen(1 << n, vector<bool>(n)). Pack state into int (mask * n + node) and use vector<bool> of size n * (1 << n) for cache.
JS/TS: Uint8Array(n * (1 << n)) for seen; bitwise ops are 32-bit signed — fine for n ≤ 30.

Common Bugs

Forgetting that the path may revisit nodes — implementing as Hamiltonian path (mask exactly indicates visited once) is wrong for LC 847; use the right transition mask | (1 << v) (idempotent on already-visited nodes).
Initializing only one start state instead of all N — gives wrong answers because the optimal path may not start at node 0.
Returning the first full-mask state encountered without distance: BFS guarantees minimal distance only because of FIFO ordering — correct here, but easy to swap for DFS by accident.
Using mask & (1 << v) as a boolean test in C/Java — works, but in JS/Python be explicit: (mask >> v) & 1.
Allocating seen = [[False] * n] * (1 << n) (shared row reference) — Python beginner trap.
Off-by-one on full = (1 << n) - 1 vs (1 << n).

Debugging Strategy

For N=4 star, hand-simulate: start (1, 0=center) at d=0, expand to neighbors {1,2,3} → states (11, 1), (101, 2), (1001, 3) at d=1. From (11, 1), can go back to 0 → (11, 0) at d=2. From (11, 0) expand to 2 or 3 → (111, 2) or (1011, 3) at d=3. From (111, 2) go to 0 → (111, 0) at d=4. From (111, 0) to 3 → (1111, 3) at d=5. But the expected answer is 4! The min path is starting from a leaf: start (2, 1) → (3, 0) → (7, 2) → (15, 3)? Wait, going 1 → 0 → 2 → 0 → 3 is 4 edges. Let me recount: starts (2, 1) at d=0, → (3, 0) at d=1, → (7, 2) at d=2, → (7, 0) at d=3, → (15, 3) at d=4. Yes, 4. The issue with my earlier trace was starting from center.

If your code returns 5, you forgot to seed BFS with all start states.

Mastery Criteria

Recognized “bitmask DP” within 60 seconds when N ≤ 20 and state involves subsets.
Wrote shortestPathLength from scratch in <20 minutes.
Stated state space size and confirmed feasibility for the given N.
Solved one cousin (LC 1125, LC 943, LC 691) from cold start.
Used proper bit operations (no string-based mask handling).
Articulated when bitmask DP fails (N > 22 → 2^N too large).

LeetCode Interview — Extreme Coding