Lab 09 — Meet in the Middle
Goal
Solve LC 1755 — Closest Subsequence Sum — via meet-in-the-middle: split the array into two halves, enumerate 2^(N/2) subset sums in each half, sort one half, and use binary search / two-pointer to find the pair-sum closest to the goal. Internalize the technique as the standard recipe whenever N is in the awkward zone 30 ≤ N ≤ 40 — too large for 2^N enumeration, too small for any polynomial DP.
Background Concepts
Meet in the middle (MITM) trades exponential time for exponential space, halving the exponent: instead of enumerating all 2^N subsets in one go (infeasible at N=40), enumerate 2^(N/2) subsets of each half (feasible at N=20: ~10⁶ each), then combine.
The combination step depends on the problem:
- Closest sum to goal (this lab): sort one half’s sums; for each sum
Lof the left half, binary-search the right half’s sums forgoal − L. - Count of subset-pairs with sum ≤ K: sort both halves; two-pointer.
- Find any subset summing to S: hashmap of one half’s sums; for each sum
L, check ifS − Lis in the map. - k-th smallest subset sum: more elaborate — heap-merge two sorted lists.
The asymptotics: O(2^(N/2) · N/2) to enumerate, O(2^(N/2) · log) for the combine. At N=40: 2^20 ≈ 10⁶, total ~2 × 10⁷. Feasible.
Interview Context
MITM is a niche but high-impact technique. Asked at Google (occasional), CP-flavored shops (frequent), and any problem set with N in [30, 45]. The signal: “N is around 30-40, brute force is 2^N, no polynomial DP visible because the state involves arbitrary subset sums”. Recognizing it converts a hopeless problem into a 30-minute solve. Not recognizing it caps you at “I would brute force but it TLEs” — a soft no-hire.
Problem Statement (LC 1755)
Given an integer array nums and integer goal, return the minimum absolute difference |sum(sub) − goal| over all non-empty subsequences (subsets) of nums. (LC 1755 allows the empty subsequence too — sum 0 — so empty is fine.)
Constraints
- 1 ≤ |nums| ≤ 40
- −10⁷ ≤ nums[i] ≤ 10⁷
- −10⁹ ≤ goal ≤ 10⁹
Clarifying Questions
- Subsequence = subset (unordered)? (Yes — LC’s “subsequence” here is order-independent because we only care about sum.)
- Empty subsequence allowed (sum 0)? (Yes per LC 1755.)
- Sums fit in 64-bit? (Max |sum| = 40 · 10⁷ = 4 × 10⁸ — fits in 32-bit Java
int. Use 64-bit defensively.)
Examples
nums = [5, -7, 3, 5], goal = 6 → 0 (5 + 3 - 5 + ... = 6 exactly via {5, 3, -7+5} = subset {5, -7, 3, 5}=6)
nums = [7, -9, 15, -2], goal = -5 → 1 (e.g., -9 + 7 = -2; |-2 - (-5)| = 3; better: -9 + 15 - 2 = 4; |-9 - (-5)|=4; -2 alone gives |-2-(-5)|=3; closest is -9 + 7 - 2 = -4, diff 1)
nums = [1, 2, 3], goal = -7 → 7 (closest is empty sum 0)
Initial Brute Force
Enumerate all 2^N subsets, compute each sum, track minimum |sum - goal|.
Brute Force Complexity
O(2^N · N). At N=40: 4 × 10¹³. TLE by 7 orders of magnitude.
Optimization Path
DP by sum? Sums range over [-4 × 10⁸, 4 × 10⁸] — too wide for a value-indexed DP. So polynomial DP is out.
O(2^(N/2)) enumeration: 2^20 ≈ 10⁶ per half. Feasible in time, requires the combine step.
For closest-sum-to-goal, sort one half’s sums; for each L in the other half, binary-search goal - L; check the two candidates around the insertion point. Total: O(2^(N/2) · N/2 + 2^(N/2) · log(2^(N/2))) = O(2^(N/2) · N).
Final Expected Approach
- Split
numsinto halvesA(first N/2) andB(last N - N/2). - Enumerate all subset sums of
A: listsumsAof size2^|A|. - Enumerate all subset sums of
B: listsumsBof size2^|B|. - Sort
sumsB. best = min(|s - goal|) for s in sumsA(handles the case where the right side contributes 0 — but since we include 0 as a subset sum of B, this is captured by step 6).- For each
ainsumsA: binary-searchsumsBforgoal - a; checksumsB[idx]andsumsB[idx-1](the two closest); updatebest. - Return
best.
Data Structures Used
- Two flat lists of subset sums.
bisect(Python) /Arrays.binarySearch(Java) /sort.Search(Go) /lower_bound(C++).
Correctness Argument
Every subset of nums decomposes uniquely as (left ∪ right) where left ⊆ A and right ⊆ B. So sum(subset) = a + b for some a ∈ sumsA, b ∈ sumsB. We want min_{a, b} |a + b - goal| = min_a min_b |b - (goal - a)|. For a fixed a, the inner min over b is solved by binary search in sorted sumsB: the closest element is at the insertion point or one position to its left. Iterating over all a ∈ sumsA covers all subsets.
Including 0 in both sumsA and sumsB covers the empty-side cases.
Complexity
| Quantity | Value |
|---|---|
| Enumerate sums | O(N · 2^(N/2)) |
| Sort | O(2^(N/2) · log(2^(N/2))) = O(N · 2^(N/2)) |
| Binary search loop | O(2^(N/2) · log(2^(N/2))) = O(N · 2^(N/2)) |
| Total time | O(N · 2^(N/2)) |
| Space | O(2^(N/2)) |
At N=40: ~4 × 10⁷ ops. Fits in 1 sec C++, ~3 sec Python.
Implementation Requirements
from bisect import bisect_left
def minAbsDifference(nums, goal):
def all_sums(arr):
sums = [0]
for x in arr:
sums = sums + [s + x for s in sums]
return sums
n = len(nums)
A = nums[: n // 2]
B = nums[n // 2 :]
sumsA = all_sums(A)
sumsB = sorted(all_sums(B))
best = abs(goal) # corresponds to the empty subset
for a in sumsA:
target = goal - a
idx = bisect_left(sumsB, target)
if idx < len(sumsB):
best = min(best, abs(a + sumsB[idx] - goal))
if idx > 0:
best = min(best, abs(a + sumsB[idx - 1] - goal))
if best == 0:
return 0
return best
Tests
- N=1,
[5], goal=5 → 0. - N=1,
[5], goal=0 → 0 (empty subset). - LC 1755 sample:
[5, -7, 3, 5], goal=6 → 0. - LC 1755 sample 2:
[7, -9, 15, -2], goal=-5 → 1. - LC 1755 sample 3:
[1, 2, 3], goal=-7 → 7. - All zeros: any goal →
|goal|. - Single huge value:
[10^7] * 40, goal=0 → 0 (empty). - Adversary: random N=40 with random values; cross-check against brute force at N=20.
Follow-up Questions
- “Now I need to count subsets with sum exactly
S.” → enumerate sums of both halves; for each sumain A, count occurrences ofS - ain B (bucket by value or use a Counter). - “Now I need subsets with sum in
[L, R].” → sort B; for eacha, binary-search the count of B-elements in[L - a, R - a]using twobisectcalls. - “What if the array has 50 elements?” → 2^25 = 3 × 10⁷ — borderline. Memory at 8 bytes per sum is 256 MB. Need to drop to bitset or stream.
- “Subset-product instead of sum?” → enumerate products; the combine is identical.
- “k-th smallest subset sum across all subsets.” → k-way merge using a min-heap from sorted subset-sum lists per half.
Product Extension
Cryptographic key knapsack (Merkle–Hellman) and certain integer programming problems with ~40 binary variables: MITM is the textbook attack / solver. Portfolio optimization with a small basket of asset switches; molecular conformation enumeration. Whenever you have a “binary vector with cost”, N ≈ 40, and no obvious polynomial structure: MITM is the move.
Language/Runtime Follow-ups
- Python: list-comprehension enumeration as shown is clean. For tighter constants use
numpyto compute all subset sums via repeatedconcatenate(s, s + x). - Java:
long[] sumsA, sumsB.Arrays.sortandArrays.binarySearchare O(log) per call. Watch heap pressure at 2 × 10⁶ longs ≈ 16 MB. - Go:
sort.Slice(sumsB, ...);sort.SearchIntsfor binary search. - C++:
vector<long long>of size 2^20 = 8 MB each.std::sort,std::lower_bound. Idiomatic. - JS/TS:
Numberis safe to ±2⁵³; sums of ±4 × 10⁸ are tiny. Use plainArray+Array.prototype.sort((a, b) => a - b).
Common Bugs
- Splitting halves as
[:n/2]and[n/2:]but accidentally usingn // 2 + 1somewhere → mismatched sizes; you’ll miss subsets. - Forgetting to include 0 (empty subset) in either half — fix by initializing
sums = [0]. - Sorting only one half but binary-searching as if both are sorted, or vice versa.
- Initial
best = float('inf')is fine, but initialbest = abs(goal)is more honest about the empty-subset case. - After binary search, only checking
sumsB[idx]and missingsumsB[idx-1](the next-smaller) — the closest element can be on either side. - Using a
setinstead of sorted list — kills the binary search.
Debugging Strategy
For small N=4, enumerate all 16 subset sums by hand and verify the MITM result. Print sumsA, sumsB (sorted) and walk one binary search by hand. If the result is consistently too large by some |x|, you forgot to include 0; if too small, you’re double-counting (e.g., overlapping halves).
Mastery Criteria
- Recognized N=30..45 + arbitrary subset-sum constraint as the MITM trigger within 90 seconds.
- Wrote MITM closest-sum from scratch in <25 minutes.
- Stated time complexity O(N · 2^(N/2)) and space O(2^(N/2)) without prompting.
- Solved LC 1755 in <30 minutes from cold start.
- Solved one cousin (LC 956 — Tallest Billboard, with a twist; or “find subset with sum closest to half”) from cold start.
- Articulated MITM’s failure point (N ≥ 50 → memory and time both blow up).