Lab 09 — Meet in the Middle

Goal

Solve LC 1755 — Closest Subsequence Sum — via meet-in-the-middle: split the array into two halves, enumerate 2^(N/2) subset sums in each half, sort one half, and use binary search / two-pointer to find the pair-sum closest to the goal. Internalize the technique as the standard recipe whenever N is in the awkward zone 30 ≤ N ≤ 40 — too large for 2^N enumeration, too small for any polynomial DP.

Background Concepts

Meet in the middle (MITM) trades exponential time for exponential space, halving the exponent: instead of enumerating all 2^N subsets in one go (infeasible at N=40), enumerate 2^(N/2) subsets of each half (feasible at N=20: ~10⁶ each), then combine.

The combination step depends on the problem:

  • Closest sum to goal (this lab): sort one half’s sums; for each sum L of the left half, binary-search the right half’s sums for goal − L.
  • Count of subset-pairs with sum ≤ K: sort both halves; two-pointer.
  • Find any subset summing to S: hashmap of one half’s sums; for each sum L, check if S − L is in the map.
  • k-th smallest subset sum: more elaborate — heap-merge two sorted lists.

The asymptotics: O(2^(N/2) · N/2) to enumerate, O(2^(N/2) · log) for the combine. At N=40: 2^20 ≈ 10⁶, total ~2 × 10⁷. Feasible.

Interview Context

MITM is a niche but high-impact technique. Asked at Google (occasional), CP-flavored shops (frequent), and any problem set with N in [30, 45]. The signal: “N is around 30-40, brute force is 2^N, no polynomial DP visible because the state involves arbitrary subset sums”. Recognizing it converts a hopeless problem into a 30-minute solve. Not recognizing it caps you at “I would brute force but it TLEs” — a soft no-hire.

Problem Statement (LC 1755)

Given an integer array nums and integer goal, return the minimum absolute difference |sum(sub) − goal| over all non-empty subsequences (subsets) of nums. (LC 1755 allows the empty subsequence too — sum 0 — so empty is fine.)

Constraints

  • 1 ≤ |nums| ≤ 40
  • −10⁷ ≤ nums[i] ≤ 10⁷
  • −10⁹ ≤ goal ≤ 10⁹

Clarifying Questions

  1. Subsequence = subset (unordered)? (Yes — LC’s “subsequence” here is order-independent because we only care about sum.)
  2. Empty subsequence allowed (sum 0)? (Yes per LC 1755.)
  3. Sums fit in 64-bit? (Max |sum| = 40 · 10⁷ = 4 × 10⁸ — fits in 32-bit Java int. Use 64-bit defensively.)

Examples

nums = [5, -7, 3, 5], goal = 6 → 0   (5 + 3 - 5 + ... = 6 exactly via {5, 3, -7+5} = subset {5, -7, 3, 5}=6)
nums = [7, -9, 15, -2], goal = -5 → 1  (e.g., -9 + 7 = -2; |-2 - (-5)| = 3; better: -9 + 15 - 2 = 4; |-9 - (-5)|=4; -2 alone gives |-2-(-5)|=3; closest is -9 + 7 - 2 = -4, diff 1)
nums = [1, 2, 3], goal = -7 → 7  (closest is empty sum 0)

Initial Brute Force

Enumerate all 2^N subsets, compute each sum, track minimum |sum - goal|.

Brute Force Complexity

O(2^N · N). At N=40: 4 × 10¹³. TLE by 7 orders of magnitude.

Optimization Path

DP by sum? Sums range over [-4 × 10⁸, 4 × 10⁸] — too wide for a value-indexed DP. So polynomial DP is out.

O(2^(N/2)) enumeration: 2^20 ≈ 10⁶ per half. Feasible in time, requires the combine step.

For closest-sum-to-goal, sort one half’s sums; for each L in the other half, binary-search goal - L; check the two candidates around the insertion point. Total: O(2^(N/2) · N/2 + 2^(N/2) · log(2^(N/2))) = O(2^(N/2) · N).

Final Expected Approach

  1. Split nums into halves A (first N/2) and B (last N - N/2).
  2. Enumerate all subset sums of A: list sumsA of size 2^|A|.
  3. Enumerate all subset sums of B: list sumsB of size 2^|B|.
  4. Sort sumsB.
  5. best = min(|s - goal|) for s in sumsA (handles the case where the right side contributes 0 — but since we include 0 as a subset sum of B, this is captured by step 6).
  6. For each a in sumsA: binary-search sumsB for goal - a; check sumsB[idx] and sumsB[idx-1] (the two closest); update best.
  7. Return best.

Data Structures Used

  • Two flat lists of subset sums.
  • bisect (Python) / Arrays.binarySearch (Java) / sort.Search (Go) / lower_bound (C++).

Correctness Argument

Every subset of nums decomposes uniquely as (left ∪ right) where left ⊆ A and right ⊆ B. So sum(subset) = a + b for some a ∈ sumsA, b ∈ sumsB. We want min_{a, b} |a + b - goal| = min_a min_b |b - (goal - a)|. For a fixed a, the inner min over b is solved by binary search in sorted sumsB: the closest element is at the insertion point or one position to its left. Iterating over all a ∈ sumsA covers all subsets.

Including 0 in both sumsA and sumsB covers the empty-side cases.

Complexity

QuantityValue
Enumerate sumsO(N · 2^(N/2))
SortO(2^(N/2) · log(2^(N/2))) = O(N · 2^(N/2))
Binary search loopO(2^(N/2) · log(2^(N/2))) = O(N · 2^(N/2))
Total timeO(N · 2^(N/2))
SpaceO(2^(N/2))

At N=40: ~4 × 10⁷ ops. Fits in 1 sec C++, ~3 sec Python.

Implementation Requirements

from bisect import bisect_left

def minAbsDifference(nums, goal):
    def all_sums(arr):
        sums = [0]
        for x in arr:
            sums = sums + [s + x for s in sums]
        return sums

    n = len(nums)
    A = nums[: n // 2]
    B = nums[n // 2 :]
    sumsA = all_sums(A)
    sumsB = sorted(all_sums(B))

    best = abs(goal)  # corresponds to the empty subset
    for a in sumsA:
        target = goal - a
        idx = bisect_left(sumsB, target)
        if idx < len(sumsB):
            best = min(best, abs(a + sumsB[idx] - goal))
        if idx > 0:
            best = min(best, abs(a + sumsB[idx - 1] - goal))
        if best == 0:
            return 0
    return best

Tests

  • N=1, [5], goal=5 → 0.
  • N=1, [5], goal=0 → 0 (empty subset).
  • LC 1755 sample: [5, -7, 3, 5], goal=6 → 0.
  • LC 1755 sample 2: [7, -9, 15, -2], goal=-5 → 1.
  • LC 1755 sample 3: [1, 2, 3], goal=-7 → 7.
  • All zeros: any goal → |goal|.
  • Single huge value: [10^7] * 40, goal=0 → 0 (empty).
  • Adversary: random N=40 with random values; cross-check against brute force at N=20.

Follow-up Questions

  1. “Now I need to count subsets with sum exactly S.” → enumerate sums of both halves; for each sum a in A, count occurrences of S - a in B (bucket by value or use a Counter).
  2. “Now I need subsets with sum in [L, R].” → sort B; for each a, binary-search the count of B-elements in [L - a, R - a] using two bisect calls.
  3. “What if the array has 50 elements?” → 2^25 = 3 × 10⁷ — borderline. Memory at 8 bytes per sum is 256 MB. Need to drop to bitset or stream.
  4. “Subset-product instead of sum?” → enumerate products; the combine is identical.
  5. “k-th smallest subset sum across all subsets.” → k-way merge using a min-heap from sorted subset-sum lists per half.

Product Extension

Cryptographic key knapsack (Merkle–Hellman) and certain integer programming problems with ~40 binary variables: MITM is the textbook attack / solver. Portfolio optimization with a small basket of asset switches; molecular conformation enumeration. Whenever you have a “binary vector with cost”, N ≈ 40, and no obvious polynomial structure: MITM is the move.

Language/Runtime Follow-ups

  • Python: list-comprehension enumeration as shown is clean. For tighter constants use numpy to compute all subset sums via repeated concatenate(s, s + x).
  • Java: long[] sumsA, sumsB. Arrays.sort and Arrays.binarySearch are O(log) per call. Watch heap pressure at 2 × 10⁶ longs ≈ 16 MB.
  • Go: sort.Slice(sumsB, ...); sort.SearchInts for binary search.
  • C++: vector<long long> of size 2^20 = 8 MB each. std::sort, std::lower_bound. Idiomatic.
  • JS/TS: Number is safe to ±2⁵³; sums of ±4 × 10⁸ are tiny. Use plain Array + Array.prototype.sort((a, b) => a - b).

Common Bugs

  1. Splitting halves as [:n/2] and [n/2:] but accidentally using n // 2 + 1 somewhere → mismatched sizes; you’ll miss subsets.
  2. Forgetting to include 0 (empty subset) in either half — fix by initializing sums = [0].
  3. Sorting only one half but binary-searching as if both are sorted, or vice versa.
  4. Initial best = float('inf') is fine, but initial best = abs(goal) is more honest about the empty-subset case.
  5. After binary search, only checking sumsB[idx] and missing sumsB[idx-1] (the next-smaller) — the closest element can be on either side.
  6. Using a set instead of sorted list — kills the binary search.

Debugging Strategy

For small N=4, enumerate all 16 subset sums by hand and verify the MITM result. Print sumsA, sumsB (sorted) and walk one binary search by hand. If the result is consistently too large by some |x|, you forgot to include 0; if too small, you’re double-counting (e.g., overlapping halves).

Mastery Criteria

  • Recognized N=30..45 + arbitrary subset-sum constraint as the MITM trigger within 90 seconds.
  • Wrote MITM closest-sum from scratch in <25 minutes.
  • Stated time complexity O(N · 2^(N/2)) and space O(2^(N/2)) without prompting.
  • Solved LC 1755 in <30 minutes from cold start.
  • Solved one cousin (LC 956 — Tallest Billboard, with a twist; or “find subset with sum closest to half”) from cold start.
  • Articulated MITM’s failure point (N ≥ 50 → memory and time both blow up).