p42 — Top K Frequent Elements
Source: LeetCode 347 · Medium · Topics: Array, Hash Table, Heap, Bucket Sort, Quickselect Companies (2024–2025 frequency): Meta (very high), Amazon (very high), Google (high), Microsoft (high), Apple (medium), Uber (high), LinkedIn (high) Loop position: Phone screen and first round. The “bucket sort is faster” curveball — most candidates default to heap, the senior signal is recognizing bucket sort.
1. Quick Context
Given nums, return the k most frequent elements.
Three acceptable solutions:
| Approach | Time | Space | When to pick |
|---|---|---|---|
| Counter + sort by count | O(n log n) | O(n) | Baseline; never primary |
| Counter + min-heap of size K | O(n log K) | O(n + K) | Production default |
| Counter + bucket sort | O(n) | O(n) | Optimal — beats heap when count ≤ n |
The discriminator is recognizing bucket sort applies: counts are bounded by n (an element can appear at most n times), so a bucket array of size n+1 indexed by frequency lets you sweep top-K in linear time. Most candidates miss this.
2. LeetCode Link + Attempt Gate
🔗 https://leetcode.com/problems/top-k-frequent-elements/
STOP. 20-min timer. Aim to derive bucket sort.
3. Prerequisite Concepts
collections.Counterfor O(n) frequency counting.heapqand min-heap-of-size-K (p41).- The bucket-sort insight: when values are bounded by n, sort by indexing.
4. How to Approach (FRAMEWORK Steps 1–9)
Step 1 — Restate: “Count each element’s frequency; return K elements with highest counts. Order among the K is unspecified.”
Step 2 — Clarify:
- “Ties — which to break in favor of?” (LC: any; if asked, specify lex / first-seen / etc.)
- “k ≤ #distinct?” (LC: guaranteed.)
- “n bound?” (LC: 10^5; both heap and bucket comfortable.)
- “Are elements hashable?” (Yes — ints.)
- “Return order matters?” (LC: no.)
Step 3 — Constraints: n ≤ 10^5 → counts ≤ 10^5. Any of the three approaches comfortably passes.
Step 4 — Examples:
[1,1,1,2,2,3], k=2→[1, 2][1], k=1→[1][4,4,4,4,4], k=1→[4]
Step 5 — Brute Force: Counter → list of (count, val) → sort by count → take top K. O(n log n).
Step 6 — Complexity: Heap O(n log K); Bucket O(n).
Step 7 — Pattern: “Top K by frequency” → bucket sort wins when count is bounded by n; heap is general.
Step 8 — Develop: see solution.py.
Step 9 — Test: k=1, k=#distinct (all elements), all same, all distinct (every count = 1).
5. Progressive Hints
If stuck for 5 min, open hints.md.
6. Deeper Insight — Bucket sort is the surprise
6.1 The bucket-sort insight
Element counts are bounded: 1 ≤ count ≤ n. So you can create a buckets array of size n + 1 where buckets[c] is the list of elements with frequency c. Sweep c from n down to 1, collecting elements until you have K.
counter = Counter(nums)
buckets = [[] for _ in range(len(nums) + 1)]
for val, cnt in counter.items():
buckets[cnt].append(val)
result = []
for c in range(len(buckets) - 1, 0, -1):
for val in buckets[c]:
result.append(val)
if len(result) == k: return result
Time: O(n) for Counter + O(n) for bucket fill + O(n) for sweep = O(n) total.
This is the optimal answer. It beats heap O(n log K) asymptotically. Why isn’t it always used? Memory: bucket array is size n + 1 regardless of K, while heap is size K. For huge n and tiny K, the heap can be more cache-friendly.
6.2 Heap solution — when bucket sort doesn’t apply
For STREAMING input where you don’t know n in advance, bucket sort fails — you can’t pre-size the bucket array. Heap-of-size-K is the streaming-friendly answer.
counter = Counter(nums)
heap = []
for val, cnt in counter.items():
if len(heap) < k:
heapq.heappush(heap, (cnt, val))
elif cnt > heap[0][0]:
heapq.heappushpop(heap, (cnt, val))
return [val for cnt, val in heap]
Tuple (cnt, val) — min-heap orders by count (first tuple element). On ties, val breaks the tie (deterministic).
6.3 Quickselect on (val, count) pairs — O(n) expected
Use quickselect to find the K largest by count. Same algorithm as p41 but the comparison key is the count, not the value itself.
items = list(counter.items()) # [(val, count), ...]
# Quickselect on items, comparing by count, targeting position len(items) - k
quickselect(items, len(items) - k, key=lambda x: x[1])
return [val for val, _ in items[len(items) - k:]]
O(n) expected; same trade-offs as p41.
6.4 Why ties don’t matter (usually)
LC accepts any K elements with the top-K counts. If two elements tie for K-th place, either is acceptable. BUT: the production version often requires deterministic tie-breaking (lex order, first-seen, etc.) — confirm with the interviewer.
For lex tie-break: heap key becomes (cnt, val) directly works (min-heap → smallest cnt-then-val at root). For “first-seen” tie-break: include an insertion counter in the key.
6.5 LC 692 — Top K Frequent Words (tie-break variant)
Same problem with deterministic tie-break: “if two words have same frequency, the lexicographically smaller comes first.” This makes heap key tricky: we want (LARGER count, SMALLER word) to be at the heap root, but heapq is min-heap.
Trick: invert just the count: heapq.heappush(heap, (-cnt, word)). Then a min-heap pops smallest (-cnt, word) = largest cnt, lex-smallest word on ties. After collecting top K (with size-K heap), pop everything and reverse.
This is the kind of detail that distinguishes a senior from a mid-level candidate.
6.6 Counter is a hash table (and that matters)
collections.Counter is a subclass of dict. Frequency counting is O(n) expected but O(n²) worst case under adversarial hash collisions (theoretical only — Python’s hash is randomized per-process by default). For arbitrary objects (not just ints/strings), ensure __hash__ is well-defined.
7. Anti-Pattern Analysis
- Sort + slice as your only answer — O(n log n), misses the bucket-sort O(n) signal.
- Heap of size n (push all, pop K) — O(n log n). Wrong reflex: heap-of-size-K is O(n log K).
- Push counts, not (count, val) tuples — you lose track of which value the count refers to.
- Bucket sort with
defaultdict(list)instead of array — slower in practice for dense bucket fill; preallocated list is the right choice. - Bucket sort with size
max(counter.values()) + 1— works but requires a pre-scan;len(nums) + 1is upper bound and avoids the scan. Tiny win for clarity. - Forgetting LC 692 tie-break — push
(-cnt, word)not(cnt, word)for size-K heap with lex tie-break. - Mutating
counter.items()during iteration — fortunately we don’t, but a common bug elsewhere. - Returning a heap, not its values — extract with list comprehension.
- Not using
Counter— manualdefaultdict(int)loop works butCounteris idiomatic.
8. Skills & Takeaways
- Bucket sort O(n) when count is bounded by n.
- Min-heap of size K with
(count, val)tuples for streaming-friendly top-K. - Quickselect on items by count key for O(n) expected.
- LC 692 trick:
(-cnt, word)for deterministic lex tie-break in size-K min-heap. Counteris the idiomatic frequency-counter; subclass ofdict.- Three valid answers, deliberate pick: bucket (optimal), heap (streaming), quickselect (interview flair).
9. When to Move On
Move on when:
- You can write bucket-sort version in ≤ 7 min.
- You can write heap version in ≤ 5 min.
- You can articulate why bucket sort is O(n).
- You handle LC 692 tie-break correctly.
- Stress test on 500 random arrays agrees with sort baseline.
10. Company Context
Meta:
- Phone-screen staple. The bucket-sort recognition is the senior signal.
- Bar Raiser asks: “what’s the time complexity? Can you do better than O(n log K)?” — expects bucket sort.
- Scorecard phrase: “Algorithm — derived O(n) bucket sort; named heap and quickselect alternatives.”
Amazon:
- L5/L6 staple. Often combined with “top K customers by purchase count” framing.
- Bar Raiser tests “top K over a 1B-row table” → discusses partitioning + per-shard top-K.
Google:
- L4/L5. Often follows up with LC 692 (words + lex tie-break) for senior signal.
Microsoft:
- L63/L64. Common in search/ranking team interviews.
Uber/LinkedIn:
- “Top K trending hashtags” / “top K skills” framing. Streaming variant common — heap-of-size-K + sliding window.
11. Interviewer’s Lens
| Signal | Junior | Mid | Senior | Staff/Principal |
|---|---|---|---|---|
| First answer | Sort by count | Heap of size K | Bucket sort O(n) | + names quickselect + streaming caveat |
| Counter | Manual dict loop | Counter(nums) | + explains it’s a dict subclass | + adversarial hashing caveat |
| Tie-break | “Doesn’t matter” | “Use tuple key” | “(-cnt, val) for lex order” | + deterministic-output discussion |
| Streaming | “Re-sort each time” | “Heap-of-size-K” | + “bucket sort needs known n” | + sliding-window Misra-Gries / Space-Saving |
| LC 692 | Brute lex sort | Tuple key (basic) | (-cnt, word) for size-K min-heap, reverse at end | + count-min sketch for approximate large-stream variant |
12. Level Delta
Mid (L4):
- Heap solution clean with
(cnt, val)tuples. - States O(n log K).
- May miss bucket sort.
Senior (L5):
- Bucket sort first; names heap as streaming alternative.
- Handles LC 692 tie-break correctly.
- Mentions Counter internals.
Staff (L6):
- Discusses production: hot-key detection, trending topics, real-time leaderboard.
- Discusses approximate streaming algorithms (Misra-Gries, Space-Saving).
- Discusses sharded top-K for distributed systems.
Principal (L7+):
- Discusses count-min sketch + heap for bounded-memory streaming top-K with provable error.
- Discusses sliding-window top-K with lazy deletion.
- Discusses adversarial input: hash collisions, count overflow, memory exhaustion.
- Discusses A/B testing: how top-K rankings drift over time and the need for stable tie-break.
13. Follow-up Q&A
Q1: LC 692 — Top K Frequent Words with lex tie-break.
A: Use heapq with key (-cnt, word). Min-heap pops smallest (-cnt, word) = largest cnt, lex-smallest on ties. After collecting K, pop all and reverse to get descending order.
Q2: Streaming over an unbounded input? A: Heap-of-size-K is feasible only if #distinct values fits in memory. For unbounded distinct values, use Space-Saving algorithm (bounded-memory approximate top-K with provable error bounds).
Q3: Top K over a sliding window of W? A: Maintain count map per window position + heap with lazy deletion. Or: bucket of “active counts” per element.
Q4: K = n (return ALL elements by frequency)? A: Sort by count: O(n log n). No win from heap or bucket — must touch everything anyway.
Q5: Top K by SUM-of-values, not count?
A: Same template; replace Counter with grouped sum, then bucket / heap / quickselect on sums.
Q6: Top K across multiple shards? A: Per-shard top-K + merge with K · S-heap (S shards). Total O((K · S) log (K · S)).
14. Solution Walkthrough
See solution.py:
top_k_frequent_bucket(nums, k)— O(n) bucket sort.top_k_frequent_heap(nums, k)— heap of size K with(cnt, val)tuples.top_k_frequent_sort(nums, k)— baseline.top_k_frequent_words(words, k)— LC 692 with lex tie-break.edge_cases()+stress_test()— 500 random arrays; compare as SETS (order unspecified).
Run: python3 solution.py.
15. Beyond the Problem
Principal-engineer code review comment:
“Three things. (1) The bucket sort is asymptotically optimal but memory is O(n) regardless of K — for a service processing top-1 over n=10^9, this matters; use the heap. State the trade-off in the docstring. (2) LC 692’s
(-cnt, word)trick is the kind of code that breaks two years later when someone ‘simplifies’ it; add a comment with WHY the negation is there and a test that fails if you remove it. (3) For real top-K systems (trending hashtags, recommendation candidates), you’re almost never doing exact top-K over batched data — you’re doing approximate top-K over a stream with bounded memory. Mention Space-Saving / Misra-Gries / count-min sketch in the design doc; the interview question is the toy version of a much harder real problem.”
Connections forward:
- p41 — Kth Largest Element (sibling pattern).
- p44 — Merge K Sorted Lists (heap-over-K).
- LC 692 — Top K Frequent Words (lex tie-break).
- LC 973 — K Closest Points (heap with distance key).
- LC 451 — Sort Characters By Frequency (bucket sort revisited).
- Phase 02 Lab 09 — Heap Top-K (lab generalization).
- Production: trending hashtags, hot-key detection, recommendation candidates, search-suggestion ranking.