p26 — 3Sum

Source: LeetCode 15 · Medium · Topics: Array, Two Pointers, Sorting Companies (2024–2025 frequency): Meta (very high), Amazon (very high), Google (high), Microsoft (high), Bloomberg (high), Apple (medium), LinkedIn (medium), Adobe (medium) Loop position: first round of onsite. Used as “warm up that filters.” Failing this signals lack of fluency.

1. Quick Context

Given an integer array, return all unique triplets [a, b, c] such that a + b + c == 0. Unique means the multiset of returned triplets has no duplicates.

This problem is the canonical “sort + converging two-pointer” pattern. It’s the highest-frequency Medium in the Meta and Amazon loops because it tests THREE separate skills in 15 minutes:

  1. Pattern recognition: “find pairs/triplets with sum” → sort + two-pointer, not brute O(N³).
  2. Dedup discipline: the array can contain duplicates. The output must not. Three places to dedup correctly (outer i, inner lo, inner hi); off-by-one in any of them fails.
  3. Off-by-one resilience under pressure: while lo < hi, not <=. After finding a triplet, BOTH pointers must advance.

The interviewer’s signal isn’t “did you solve it” — it’s “did you solve it FAST and CLEAN.” Senior+ writes it in 12 minutes without a bug; Mid writes it in 22 minutes with one off-by-one to fix.

The classic anti-pattern: dedup using set(tuple(sorted(triplet)) for triplet in results). It works, but loses the sorted-array dedup-by-skip signal that the interviewer wants to see.


🔗 https://leetcode.com/problems/3sum/

STOP. 25-min timer. Sort + converging two-pointer; dedup by skip. Don’t peek.


3. Prerequisite Concepts


4. How to Approach (FRAMEWORK Steps 1–9)

Step 1 — Restate: “Given an integer array nums, return all UNIQUE triplets (i, j, k) with i < j < k such that nums[i] + nums[j] + nums[k] == 0. Returned triplets should be the VALUES, deduplicated as multisets, not the indices.”

Step 2 — Clarify:

  • “Sorted output required?” (LC: no, but nums is freely mutable so sorting is fine.)
  • “Can input have duplicates?” (LC: yes. Output triplets must not duplicate.)
  • “What counts as ‘duplicate triplet’?” (Same multiset of values. [-1, 0, 1] and [0, 1, -1] are duplicates.)
  • “Empty / fewer than 3 elements?” (Return [].)
  • “Integer range?” (LC: -10⁵ to 10⁵. No overflow concerns in Python.)

Step 3 — Constraints: N ≤ 3000 for LC. Brute O(N³) = 2.7×10¹⁰ — TLE. O(N²) = 9×10⁶ — fast. Need O(N²).

Step 4 — Examples:

  • [-1, 0, 1, 2, -1, -4][[-1, -1, 2], [-1, 0, 1]]
  • [0, 0, 0][[0, 0, 0]]
  • [0, 0, 0, 0][[0, 0, 0]] (dedup!)

Step 5 — Brute Force: Three nested loops, hash-set dedup. O(N³) time, O(N²) extra. State, then pivot.

Step 6 — Complexity (optimal): O(N²) time, O(1) auxiliary (output excluded), O(N log N) for the sort (subsumed).

Step 7 — Pattern Recognition: “Find all triplets summing to target” → fix one element, two-sum the rest with sorted converging two-pointer. Sort first. For each i, find pairs (lo, hi) with nums[lo] + nums[hi] == -nums[i]. Dedup by skipping equal-value runs.

Step 8 — Develop:

def threeSum(nums):
    nums.sort()
    n = len(nums)
    out = []
    for i in range(n - 2):
        if i > 0 and nums[i] == nums[i-1]: continue  # outer dedup
        if nums[i] > 0: break                          # pruning: positives can't sum to 0 with bigger
        target = -nums[i]
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = nums[lo] + nums[hi]
            if s == target:
                out.append([nums[i], nums[lo], nums[hi]])
                lo += 1; hi -= 1
                while lo < hi and nums[lo] == nums[lo-1]: lo += 1  # inner-left dedup
                while lo < hi and nums[hi] == nums[hi+1]: hi -= 1  # inner-right dedup
            elif s < target:
                lo += 1
            else:
                hi -= 1
    return out

Step 9 — Test: dup 0s ([0,0,0,0]), all negative, all positive, mixed, empty, just-2-elements.


5. Progressive Hints

If stuck for 5 min, open hints.md.


6. Deeper Insight — Why “sort, then converging two-pointer” is optimal

The lower bound for 3Sum is believed to be O(N²) — this is the famous 3SUM conjecture in computational geometry. Many problems reduce to 3SUM, and a sub-O(N²) solution would imply faster algorithms for several geometric problems (collinear points, GeomBase, etc.). As of 2024, the best known is O(N² / (log N / log log N)²) via Grønlund-Pettie — i.e., we have not even broken O(N²) by a polylog factor.

So when the interviewer asks “can you do better than O(N²)?” — the answer is “no, modulo a famous open conjecture in computational geometry.” Senior+ candidates name-drop this. Staff+ adds: “the GP bound shaves a polylog factor but isn’t practical.”

Why the move rule works: After sorting, fix nums[i]. For the remaining suffix, we want pairs summing to -nums[i]. At any state (lo, hi):

  • If nums[lo] + nums[hi] < target, no pair (lo, hi') for hi' < hi can reach target (sum would be even smaller). So lo MUST advance.
  • Symmetric for > case.
  • If ==, we found one. We must advance BOTH (advancing only one would mean staying at (lo+1, hi) where nums[lo+1] + nums[hi] is either bigger than target — hi must advance — or equal — we’d re-record the same triplet after dedup; so jointly advancing is correct).

Why dedup-by-skip beats hash-set: hash-set works in O(1) per insertion but loses the structural property of sortedness. The skip approach is O(1) too, but it’s a visible algorithmic technique that the interviewer reads as “candidate understands invariants.”


7. Anti-Pattern Analysis

  1. set of sorted tuples for dedup. Works, gets the answer, but loses the senior signal. Show skip-dedup explicitly.
  2. while lo <= hi instead of <. With <=, you’d consider (i, i, i) triplets which violate i < j < k. Subtle, hangs in interview.
  3. Forgetting to advance BOTH lo and hi after a hit. Infinite loop or duplicate triplet.
  4. Dedup only on i, not on lo/hi. Misses cases like [-2,0,0,2,2][-2,0,2] would appear twice.
  5. Breaking on nums[i] > 0 early without sorting first. Pruning is correct only if sorted.
  6. Hashing each nums[i] value to avoid outer dup. Works but adds a hash table; the skip is cleaner.
  7. Forgetting empty / n < 3. Doesn’t crash with for i in range(n-2) if n <= 2, but should be mentioned.
  8. Sorting and then INDEXING the original array. After sort, indices are different. If interviewer asks for indices (Two-Sum-like), you’d need to track original indices via a sort-of-pairs.
  9. N³ brute with if (a, b, c) not in seen. Membership in a list is O(N), so it’s O(N⁴). Use a set for seen.

8. Skills & Takeaways

  • Sort-then-pivot as a meta-pattern: when a problem allows multiple comparisons / dedup, sorting first often unlocks O(N²) from O(N³).
  • Converging two-pointer as the inner loop after fixing one index.
  • Dedup by skipping equal-value runs in sorted arrays.
  • Move-rule correctness arguments (“if s < target, no smaller hi' can fix it”).
  • The 3SUM conjecture as a piece of trivia that signals depth.
  • Pruning: once nums[i] > 0, break — any triplet would have positive sum.

9. When to Move On

Move on when:

  • You can write 3Sum in under 12 minutes with no bugs.
  • You can articulate the move rule’s correctness without notes.
  • You can write 4Sum (LC 18) by induction on the 3Sum solution in another 10 min.
  • Your stress test compares against the O(N³) brute on 200 random small arrays.
  • You can mention the 3SUM conjecture and what it implies.

10. Company Context

Meta:

  • Top 5 most-asked problem at Meta period. Almost certain to see it or a variant (3Sum Closest, 3Sum Smaller, 4Sum) in any algorithm round.
  • Watches for: time to first correct submission, dedup discipline, off-by-one errors.
  • Follow-ups always include “what if k=4?” and “what if we want the count, not the triplets?”
  • Scorecard phrase: “Strong — clean sort + 2-pointer, all three dedups, generalized to k-Sum.”

Amazon:

  • High-frequency in mid-loop “coding bar” round. Bar Raiser may add LC 18 (4Sum) as follow-up.
  • Watches dedup discipline obsessively. Failing dedup = “no hire.”
  • Scorecard phrase: “Algorithm + Dive Deep — recognized 2-pointer immediately, correct skip-dedup, discussed 3SUM lower bound.”

Google:

  • Often comes with theoretical follow-up: “is O(N²) optimal? prove or disprove.” Cite 3SUM conjecture.
  • May ask for the COUNT of triplets (no need to enumerate) — same algorithm, just increment counter.
  • Scorecard phrase: “Algorithm — solid + mentioned 3SUM lower bound.”

Microsoft:

  • Cleaner variant: just want the algorithm, less dedup gotcha.
  • Scorecard phrase: “Strong — sort + 2-pointer.”

Bloomberg:

  • Frames as financial: “find three trades with net P&L zero.” Same algorithm.
  • Often paired with LC 1 (Two Sum) in same session as the warmup-then-real-thing test.
  • Scorecard phrase: “Pragmatic — 2-pointer, financial application discussed.”

Apple:

  • Frames as data analytics: “find three sensor readings averaging to baseline.” May ask floating-point version (epsilon comparison).
  • Scorecard phrase: “Strong — clean 2-pointer, discussed numerical tolerance.”

LinkedIn:

  • Often a 30-min portion paired with another problem. Pressure round.
  • Scorecard phrase: “Solid under time pressure.”

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff/Principal
First instinctN³ nested loops“I think two-pointer?”Sort + 2-pointer immediatelySort + 2-pointer + states pruning
DedupSet of tuplesSet of tuples; then skip after hintAll three skip-dedups, first trySkip-dedup + explains why hash-set is “less clean”
Move ruleDoesn’t explainHand-wavesProves: “smaller sum, lo must advance”Proves + connects to invariant
Complexity“I dunno”O(N²)O(N²), notes sort subsumedO(N²), notes 3SUM conjecture
k-Sum follow-upStuckRe-codes 4SumSketches generic k-Sum recursionImplements generic k-Sum

12. Level Delta

Mid (L4):

  • Sort + converging 2-pointer.
  • Dedup works (may use set of tuples).
  • One off-by-one to fix in interview.
  • States O(N²) complexity.

Senior (L5):

  • Skip-dedup at all three positions, first try.
  • Articulates move-rule correctness.
  • States complexity precisely including sort cost.
  • Handles edge cases proactively.

Staff (L6):

  • Generalizes to k-Sum via recursion in 10 minutes.
  • Discusses 3SUM conjecture.
  • Discusses 3Sum Closest, 3Sum Smaller variants.
  • Notes pruning optimizations (nums[i] > 0 break, nums[i] + nums[i+1] + nums[i+2] > 0 break).

Principal (L7+):

  • Discusses 3SUM-hardness as a reduction concept (problems reducible to 3SUM).
  • Discusses Grønlund-Pettie’s O(N² / (log N / log log N)²) bound — and why it’s not practical.
  • Discusses parallelization: 3Sum is embarrassingly parallel on i; mention map-reduce style.
  • Discusses floating-point variant numerical issues (Kahan summation, epsilon comparison).

13. Follow-up Q&A

Q1: 4Sum (LC 18) — pairs summing to target. A: Wrap one more outer loop, dedup it, then call the 2-pointer inner. O(N³). Generic k-Sum is O(N^(k-1)).

Q2: 3Sum Closest (LC 16) — return sum closest to target. A: Same structure. Track best_diff instead of equality. O(N²). When s == target, return immediately.

Q3: 3Sum Smaller (LC 259) — COUNT of triplets with sum < target. A: After sort, fix i. For two-pointer (lo, hi): if nums[lo] + nums[hi] < target, ALL (lo, lo+1..hi) work — add hi - lo to count, advance lo. Else advance hi. O(N²).

Q4: Can we do better than O(N²)? A: Open. 3SUM conjecture believes no. Best known is Grønlund-Pettie’s O(N² / (log N / log log N)²). A sub-quadratic solution would break decades of work in computational geometry.

Q5: What if nums is a stream (no random access)? A: We need 2 passes if 1-pass isn’t possible. First pass: collect into list (O(N) memory). Then sort + 2-pointer. If memory is constrained, this becomes external sort + then a different algorithm — challenging.


14. Solution Walkthrough

See solution.py:

  • three_sum(nums) — canonical sort + converging 2-pointer with skip-dedup.
  • three_sum_hashset(nums) — alternative using hash for inner two-sum (worse constants, same Big-O, kept for comparison).
  • three_sum_brute(nums) — O(N³) oracle for stress.
  • k_sum(nums, target, k) — generalized k-Sum via recursion (bonus).
  • stress_test() — 200 random arrays N≤15, all variants agree.
  • edge_cases() — empty, n<3, all zeros, duplicates, all negative, all positive, LC canonical.

Run: python3 solution.py → “ALL TESTS PASSED”.


15. Beyond the Problem

Principal-engineer code review comment:

“The skip-dedup is correct but please add INVARIANT comments. New hires copy this code into adjacent problems (4Sum, 3SumClosest, kSum) and the dedups get subtly wrong without the comments. Specifically: ‘invariant: after this advance, nums[lo] != nums[lo-1], so the next triplet will not be a duplicate of the just-recorded one.’ Also — the k-Sum recursion you wrote is correct but allocates a new list at each level; for our hot-path analytics service this matters. Use indices and accumulate into a single output list. Last point: the 3SUM conjecture is interesting trivia but please don’t put it in the docstring; the next on-call engineer doesn’t care.”

Connections forward:

  • Phase 02 — lab-01-two-pointers (this problem is the keystone example).
  • Phase 06 — Greedy has problems where sort-then-scan is the optimal structure; same skill.
  • 3SUM reductions show up in computational geometry: collinear points, distinct distances, GeomBase.
  • Production: financial reconciliation (3 trades netting to zero), sensor anomaly detection (3 readings summing to baseline), audit logs (3 events with combined impact 0).
  • Theoretical: 3SUM-hardness is a complexity class — many geometric problems are 3SUM-hard.