p37 — Combinations
Source: LeetCode 77 · Medium · Topics: Backtracking Companies (2024–2025 frequency): Amazon (high), Google (medium-high), Meta (medium), Microsoft (medium), Apple (medium) Loop position: algorithm round, second slot. Tests whether the candidate distinguishes “permutation” from “combination” — a common confusion.
1. Quick Context
Given integers n and k, return all C(n, k) combinations of k numbers chosen from [1..n]. Order within a combination is non-decreasing by convention (and to deduplicate naturally).
The key distinction from permutations: a start index replaces used[]. Each recursion picks values from [start..n], ensuring monotonically increasing selection — which automatically deduplicates {1,2} and {2,1}.
def backtrack(start, path):
if len(path) == k:
result.append(path[:]); return
for i in range(start, n + 1):
path.append(i)
backtrack(i + 1, path)
path.pop()
Time: O(k · C(n, k)). Space: O(k) recursion.
The discriminator: the pruning optimization — for i in range(start, n - (k - len(path)) + 2) — which skips branches that can’t reach length k. Without it, you still iterate over impossible starts; with it, you cut significant work at large n.
2. LeetCode Link + Attempt Gate
🔗 https://leetcode.com/problems/combinations/
STOP. 15-min timer.
3. Prerequisite Concepts
- p36 (Permutations).
- The combinatorial identity
C(n, k) = C(n-1, k-1) + C(n-1, k)(Pascal’s rule). - Why monotonic order (
i < jfor path positionsi, j) deduplicates.
4. How to Approach (FRAMEWORK Steps 1–9)
Step 1 — Restate: “All k-subsets of {1, 2, …, n}, as lists.”
Step 2 — Clarify:
- “Numbers from 1 to n inclusive?” (Yes.)
- “k ≤ n?” (Yes per spec.)
- “Order within each combination?” (Non-decreasing — natural for dedup.)
Step 3 — Constraints: n ≤ 20, k ≤ n. C(20, 10) ≈ 184k. Manageable.
Step 4 — Examples:
n=4, k=2→[[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]].n=1, k=1→[[1]].k=0→[[]].
Step 5 — Brute Force: Iterate all 2^n subsets, keep those of size k. O(2^n) — vastly worse than O(k · C(n,k)) for small k. State it.
Step 6 — Complexity: O(k · C(n, k)) time. O(k) recursion depth.
Step 7 — Pattern: “Choose k from n without repetition, unordered” → combinations template (start index).
Step 8 — Develop: see solution.py. Both naive and pruned loop bound.
Step 9 — Test: edge cases (k=0, k=n, n=1), stress vs itertools.combinations.
5. Progressive Hints
If stuck for 5 min, open hints.md.
6. Deeper Insight — The pruning that changes everything
6.1 The naive bound vs the tight bound
Naive: for i in range(start, n + 1). Iterates even when it’s IMPOSSIBLE to complete path to length k.
Tight: for i in range(start, n - (k - len(path)) + 2).
Derivation: we need to add (k - len(path)) more elements. The last possible starting value i is one that still leaves (k - len(path) - 1) elements above it in [i+1..n]. That requires n - i >= k - len(path) - 1, i.e., i <= n - (k - len(path)) + 1. So range’s upper bound (exclusive) is n - (k - len(path)) + 2.
For n=20, k=10, the naive variant explores ~3× more nodes than the pruned one. For larger n, the gap widens.
6.2 Why the start index suffices (no used[])
In permutations, we revisit “early” elements at deeper levels (e.g., [2,1,3] after [1,2,3]). In combinations, we never revisit — once we choose i, the next pick is from [i+1..n]. The strictly increasing structure makes {1,2} and {2,1} the SAME thing, and we generate only the canonical (non-decreasing) form.
6.3 LC 39 — Combination Sum (with reuse)
Same template, but backtrack(i, ...) (not i+1) — allows reusing the same element. Plus a target -= candidates[i] accumulator with target == 0 as the success condition and target < 0 as a prune.
6.4 LC 40 — Combination Sum II (with duplicates in input, no reuse)
Same template + the same dedup line as LC 47: if i > start and candidates[i] == candidates[i-1]: continue. Note i > start (not i > 0) — at the same recursion level, skip equal siblings; the first time at a level, take it.
6.5 LC 216 — Combination Sum III
k numbers from [1..9] summing to n. Pure combination of LC 77 + LC 39 logic.
6.6 Connection to Pascal’s triangle
C(n, k) = C(n-1, k-1) [contains n] + C(n-1, k) [doesn’t contain n]. This is the “include/exclude” framing of the same algorithm — the recursion tree has the same shape, just viewed differently. The include/exclude framing dominates subsets (p38).
7. Anti-Pattern Analysis
- Using
used[]as in permutations. Generates k! × C(n,k) = P(n,k) results with massive duplication. The dedup-at-end is exponentially wasteful. backtrack(start, ...)then recursing withstartinstead ofi+1. Allows reuse — that’s LC 39, not LC 77.- No pruning bound. Works but slow for large n.
- Generating all 2^n subsets and filtering by size k. O(2^n) — vastly slower for small k.
result.append(path)instead ofpath[:]. Same bug as p36.- Off-by-one on the range —
range(start, n)instead ofrange(start, n + 1)(1-indexed!). - For LC 39 (reuse), forgetting the
target < 0prune — the recursion explodes.
8. Skills & Takeaways
- Start-index template = the combination signature.
- Pruning bound:
n - (k - len(path)) + 2. Memorize. - Reuse variant = recurse with
i, noti+1. - Duplicates-in-input variant = sort +
if i > start and a[i] == a[i-1]: continue. - The include/exclude view (Pascal’s rule) — bridges to subsets (p38).
9. When to Move On
Move on when:
- You can write the basic combinations template in under 90 seconds.
- You can add the pruning bound from memory.
- You can write LC 39 and LC 40 in under 5 minutes each.
- Stress test of 200 random
(n, k)matchesitertools.combinations.
10. Company Context
Amazon:
- L5 staple. Combined with “now find combinations summing to target” (LC 39) is the standard 2-part question.
- Scorecard phrase: “Algorithm — used start-index template; pruning bound on second pass.”
Google:
- L4/L5. Often paired with LC 78 (Subsets) to test whether you see the family connection.
- Scorecard phrase: “Strong — clean template, articulated connection to subsets and permutations.”
Meta:
- Standard backtracking warm-up; rare as the sole question.
Microsoft:
- L63. Often combined with “now stream them” (generator variant).
Apple:
- Common in algo loops, particularly for ML / data teams (feature combinations).
11. Interviewer’s Lens
| Signal | Junior | Mid | Senior | Staff/Principal |
|---|---|---|---|---|
| Approach | Permute-and-dedupe | Start index, naive bound | Start index + pruning bound | + cites Pascal’s rule and the include/exclude view |
| Variants | Knows only LC 77 | + LC 39 | + LC 40 (dedup) | + LC 216 + connection to dynamic programming (knapsack family) |
| Output | Includes duplicates | Clean | Clean + complexity | + generator variant for memory |
12. Level Delta
Mid (L4):
- Start-index template, correct snapshot.
- Handles
k=0,k=n,n=1.
Senior (L5):
- Adds the pruning bound and explains the derivation.
- Solves LC 39 (reuse) and LC 40 (duplicates) on demand.
Staff (L6):
- Discusses the include/exclude framing and connection to DP / knapsack.
- Generator variant for streaming.
- Discusses when combinations are the wrong tool — e.g., k-th combination (LC 60 analogue) via combinatorial number system.
Principal (L7+):
- Discusses production: feature selection in ML, A/B test bucket enumeration, combinatorial test generation (pairwise / t-wise coverage tools like ACTS), portfolio combinatorial optimization.
- Discusses parallel enumeration via prefix-based work splitting.
- Discusses lattice / Bratteli diagrams for advanced combinatorial enumeration.
13. Follow-up Q&A
Q1: Combination Sum (LC 39) — same value usable infinitely.
A: Recurse with i instead of i + 1. Subtract from target; success when target == 0; prune when target < 0.
Q2: Combination Sum II (LC 40) — duplicates in input, each used once.
A: Sort + if i > start and a[i] == a[i-1]: continue + recurse with i + 1.
Q3: K-th combination of [1..n] choose k. A: Combinatorial number system. At each position, the i-th element of the k-th combination is determined by counting how many combinations start with each prefix. O(k · n) — no enumeration needed.
Q4: Generate combinations as a stream.
A: Python generator with yield list(path) instead of result.append(...). Same template.
Q5: Why not bitmask?
A: For k-combinations from n ≤ 32, enumerate bitmasks with exactly k bits set (Gosper’s hack: t = v | (v - 1); w = (t + 1) | (((~t & -~t) - 1) >> (v.bit_length() of (v & -v)))). O(1) per next-combination. Beautiful for n ≤ 64; backtracking is clearer.
14. Solution Walkthrough
See solution.py:
combine(n, k)— start-index template with pruning bound.combine_naive(n, k)— without pruning, for comparison.combine_brute(n, k)—itertools.combinationsoracle.edge_cases()+stress_test()— 200 random (n, k), all variants agree.
Run: python3 solution.py.
15. Beyond the Problem
Principal-engineer code review comment:
“Two notes. (1) The pruning bound
n - (k - len(path)) + 2is correct but reads like noise. Inline a one-line comment showing the derivation, and consider naming a local:max_first = n - need + 1. (2) If this is a hot path, switch to bitmask enumeration with Gosper’s hack — branchless, no recursion, ~10× faster for n ≤ 64. But ONLY for hot paths — for everything else, the recursion is more maintainable. (3) For very large n with sparse k (e.g., n=10000, k=3), the combinatorial number system gives O(k) per next-combination — useful when you want to sample or stream.”
Connections forward:
- p38 — Subsets: include/exclude view of the same template.
- p36 — Permutations: contrasting “all orderings” vs “unordered choice”.
- LC 39 — Combination Sum: reuse variant.
- LC 40 — Combination Sum II: duplicate-input variant.
- LC 216 — Combination Sum III: bounded-k + target.
- LC 90 — Subsets II: dedup version of p38.
- Phase 02 Lab 08 — Backtracking lab.
- Phase 03 Lab 08 — Bitmask DP: connection to bit-enumeration via Gosper’s hack.
- Production: feature selection (ML), A/B-test bucket design, combinatorial test coverage.