Lab 01 — Two Pointers: 3Sum (Deep Duplicate Handling)
Goal
Master the opposite-ends two-pointer pattern on a sorted array, with the specific discipline required for correct duplicate suppression. Deliverable solves 3Sum in O(N²) time, O(1) extra space, returning all unique triplets — and you can articulate, line by line, why each duplicate-skip is needed and what bug it prevents.
Background Concepts
Sorting as a precondition for two-pointer; loop invariants on (l, r); duplicate suppression at three loci (the outer i, the inner l, the inner r); the relationship between 3Sum and 2Sum-on-sorted; why hash-set deduplication is the wrong tool here. Review pattern Two Pointers and pattern Hashing.
Interview Context
3Sum is one of the single most asked problems in Big Tech onsite loops — it appears at Meta, Amazon, Bloomberg, and Apple. The signal is whether you can articulate the duplicate-handling logic before you code, not retroactively. Naive candidates produce O(N²) triplets and dedup via a hash set (set(tuple(sorted(triplet)))), which works but signals weak invariant reasoning. Strong candidates handle duplicates inside the two-pointer loop with three skip-clauses and explain each one’s purpose. Elite candidates also address the trade-off vs the hash-based 3Sum (when the input is unsorted and you can’t sort).
Problem Statement
Given an integer array nums, return all unique triplets [nums[i], nums[j], nums[k]] such that i, j, k are distinct indices and nums[i] + nums[j] + nums[k] == 0. The result must not contain duplicate triplets.
Constraints
3 ≤ N ≤ 3000-10^5 ≤ nums[i] ≤ 10^5- Output is a list of triplets in any order; each triplet’s internal order doesn’t matter.
- Triplets are deduplicated by value, not by index:
[-1, 0, 1]from indices(0, 2, 4)and from(1, 3, 5)count once.
Clarifying Questions
- Should
[-1, 0, 1]and[0, 1, -1]count as different triplets? (No — value-set duplicates.) - Are duplicates in the input array allowed? (Yes — many test cases hinge on this.)
- Can the input be empty / size < 3? (Per constraints, N ≥ 3, but mention you’d guard.)
- Can values exceed 32-bit range when summed? (Per constraints, max |sum| ≤ 3·10^5, safe in 32-bit, but you should mention overflow as a habit.)
- Is there a strict no-extra-space requirement, or is the output allocation OK?
Examples
| Input | Output |
|---|---|
[-1,0,1,2,-1,-4] | [[-1,-1,2],[-1,0,1]] |
[0,0,0,0] | [[0,0,0]] |
[1,2,-2,-1] | [] |
[] | [] |
[0] | [] |
Initial Brute Force
Three nested loops; check nums[i] + nums[j] + nums[k] == 0; dedup with a hash set of sorted triplets.
def three_sum_brute(nums):
out = set()
n = len(nums)
for i in range(n):
for j in range(i + 1, n):
for k in range(j + 1, n):
if nums[i] + nums[j] + nums[k] == 0:
out.add(tuple(sorted((nums[i], nums[j], nums[k]))))
return [list(t) for t in out]
Brute Force Complexity
Time O(N³). Space O(N²) worst case for out (number of unique triplets). At N=3000, ~2.7×10^10 operations — far too slow (>30 seconds in any language).
Optimization Path
The sub-problem after fixing nums[i] is 2Sum on the remaining sorted array with target -nums[i]. 2Sum-on-sorted is O(N) via opposite-ends two-pointer. Total: O(N) outer × O(N) inner = O(N²).
Why sort? Sortedness gives 2Sum-on-sorted an O(N) two-pointer algorithm; without sort, 2Sum is O(N) via hash but combining hash-2Sum with outer triplet enumeration makes duplicate-handling much trickier.
Final Expected Approach
def three_sum(nums):
nums.sort()
n = len(nums)
out = []
for i in range(n - 2):
if nums[i] > 0: break # early exit
if i > 0 and nums[i] == nums[i - 1]: continue # skip i-duplicate
l, r = i + 1, n - 1
while l < r:
s = nums[i] + nums[l] + nums[r]
if s < 0:
l += 1
elif s > 0:
r -= 1
else:
out.append([nums[i], nums[l], nums[r]])
l += 1; r -= 1
while l < r and nums[l] == nums[l - 1]: l += 1 # skip l-duplicate
while l < r and nums[r] == nums[r + 1]: r -= 1 # skip r-duplicate
return out
Data Structures Used
- The input array, sorted in place.
- An output list of triplets.
- Three integer indices (
i,l,r).
No hash structures; no auxiliary lists beyond output.
Correctness Argument
After sorting, fix i. The remaining array nums[i+1..n-1] is sorted, and we run the canonical opposite-ends 2Sum: when the sum is too small we advance l, when too large we retreat r, when equal we record and advance both. The standard 2Sum-on-sorted invariant proves that every value pair (nums[l], nums[r]) with l < r and nums[l] + nums[r] == target is found exactly once, in sorted order.
For duplicates: the three skip-clauses ensure that each distinct triplet by value is recorded exactly once.
if i > 0 and nums[i] == nums[i-1]: continue— the previous outer iteration already enumerated all triplets with first element equal tonums[i]. Without this,[-1,-1,0,1,2]would record[-1,-1,2]and[-1,0,1]twice (once for each-1as the outer pivot).while l < r and nums[l] == nums[l-1]: l += 1(after recording) — if multiplenums[l]values exist within(i, r), they all pair with the samenums[r]to give the same triplet. Skip them.while l < r and nums[r] == nums[r+1]: r -= 1(symmetric for the right pointer) — same reasoning.
The early exit if nums[i] > 0: break is correct because once the smallest element of the triplet is positive, no triplet sums to zero (the array is sorted).
Complexity
Time O(N²): O(N log N) sort + O(N²) total work in the nested two-pointer (each l and r move monotonically over a window of size O(N), and the outer i runs N times). Space O(1) extra (excluding output and the sort’s O(log N) recursion stack).
Implementation Requirements
- Sort in place (do not allocate a sorted copy).
- Skip duplicates inside the loop, not via a hash set on output.
- Use the early-exit
if nums[i] > 0: break(small but real speedup for typical inputs). - Move both pointers on a match, then run the skip loops — not before, or you’ll never advance off the matched element.
- Return a
List<List<Integer>>/list[list[int]]— not a set, not tuples.
Tests
- Smoke:
[-1,0,1,2,-1,-4]→[[-1,-1,2],[-1,0,1]]. - Unit: all-zeros (
[0,0,0,0]→[[0,0,0]]); no triplets ([1,2,3]→[]); minimum size ([0,0,0]→[[0,0,0]]). - Edge: size 0 / 1 / 2 →
[]; all-negative; all-positive (sum > 0 from index 0 → break immediately, return[]). - Large: N = 3000, values in
[-10^5, 10^5]; assert sub-second; verify count against the brute force on a 100-element prefix. - Random: generate 50 random inputs of size ≤ 200; compare against the brute force solution as oracle.
- Adversarial:
[0]*3000(all zeros — exactly one triplet[0,0,0]); long run of duplicates of a single value. - Invalid: non-integer / null input — out of scope per constraints, but mention you’d validate at the boundary.
Follow-up Questions
- “What about kSum?” → recurse:
kSum(nums, target, k)calls(k-1)Sum(remaining, target - nums[i]), base case is 2Sum-sorted. Time O(N^(k-1)). - “What if the array is unsorted and you cannot sort it?” → 2Sum-with-hash inside an outer enumeration; duplicate handling becomes per-output-set deduplication, more memory.
- “What if values repeat extremely (e.g., 99% zeros)?” → the duplicate skips handle this in O(N²) worst case, but in practice each outer iteration is O(1) for the duplicate values; you’d see a near-O(N) effective runtime on that adversarial input.
- “Can you do better than O(N²)?” → not under standard reductions: 3Sum has a known conditional lower bound of Ω(N²) (3SUM-hardness conjecture). Subquadratic 3Sum implies subquadratic many problems in computational geometry.
- “What about returning closest triplet to a target?” → 3SumClosest variant; same skeleton, track the best
|s - target|seen.
Product Extension
Detecting fraud rings of size 3 in a transaction graph where the signed sum of three transactions must net to zero (cancel out), under the constraint that the transactions hashed-distinctly. The same skeleton — fix one transaction, two-pointer the rest by signed amount — works, with the wrinkle that “duplicate” must be defined carefully (transactions are distinct by ID even if amounts are equal, so the duplicate skipping is replaced by a per-amount enumeration that emits all distinct ID combinations summing to zero).
Language/Runtime Follow-ups
- Python:
nums.sort()is in place. Usenums.append([nums[i], nums[l], nums[r]])notnums.append((..))if a list-of-lists is required (LC accepts either, but the contract is list). - Java: sort with
Arrays.sort(nums)— note this is dual-pivot quicksort for primitives, average O(N log N). ForInteger[]it’s TimSort. UseList<List<Integer>>and addArrays.asList(a, b, c)per triplet —Arrays.asListreturns a fixed-size list, which is fine because we never mutate it. - Go:
sort.Ints(nums)sorts in place; the rest is pointer arithmetic over indices. - C++:
std::sort(nums.begin(), nums.end()). Usestd::vector<std::vector<int>>for output. Beware: integer additionnums[i] + nums[l] + nums[r]can overflow 32-bit if value bounds were larger; with the given constraints it’s safe, but as a habit, uselong long. - JS/TS:
Array.prototype.sort()defaults to lexicographic comparison — this is a top-3 JS interview bug. Usenums.sort((a, b) => a - b). Also,a - bcan overflow 32-bit if values exceed ±2^30; for very large values useMath.sign(a - b). - Adversarial: sorting is the dominant constant; if the input is already sorted (best case for many sorts) this is faster than the brute by another factor.
Common Bugs
- Forgetting the
nums[i] == nums[i-1]skip on outer — produces duplicate output triplets like[-1,-1,2]repeated. - Forgetting to advance pointers on match —
l += 1; r -= 1must come before the inner duplicate-skip loops; otherwisenums[l] == nums[l-1]is comparingnums[l]to itself and the skip loop runs forever (well, runs untill == r, but produces no progress on the matched value). - Using
<vs<=inwhile l < r—l == rwould mean the same index appearing as both pointers, which is invalid (i, j, k must be distinct indices). - JS lexicographic sort —
[-1, -1, 2, -4, 0, 1]after default sort is[-1, -1, -4, 0, 1, 2](string-sorted). Always pass a numeric comparator. - Missing 32-bit overflow in C++/Java/Go when constraints allow large values. With LC-3Sum’s constraints it doesn’t bite, but the habit costs nothing and saves you on related problems (4Sum, kSum-with-target).
- Hash-set deduplication on output — works, but signals you didn’t internalize the invariant. Time still O(N²) but space O(N²) instead of O(1).
- Sorting twice by accident (once explicitly, once implicit in a downstream API) — innocuous but signals carelessness.
Debugging Strategy
- Run on
[0,0,0,0]first — should return[[0,0,0]]. If you get[[0,0,0],[0,0,0]], your outer-skip is broken. If you get[], your inner loop never matches (likelyl < rtypo). - Run on
[-2,0,0,2,2]. Expected:[[-2,0,2]]. If you get[[-2,0,2],[-2,0,2]], your inner-rskip is broken. - Diff against the brute force on 50 random inputs of size 50. The answers should match as sets (order of triplets and inner order doesn’t matter; sort each triplet and the list of triplets to compare).
- Time a 3000-element random input. Should run in well under a second in Python; under 50ms in C++ / Java / Go.
Mastery Criteria
- Recognized the signal “sorted, find triplet summing to target” as 3Sum within 30 seconds of reading.
- Articulated the three skip-clauses before writing them.
- Wrote a correct implementation in under 8 minutes, no lookup.
- Passed the all-zeros, single-duplicate, and large-N tests on the first attempt.
- Stated the conditional Ω(N²) lower bound when asked “can you do better?”.
- Identified the JS lexicographic-sort trap (or its language equivalent) without prompting.
- Generalized verbally to kSum and to 3SumClosest.