p101 — LFU Cache

Source: LeetCode 460 · Hard · Topics: Design, Hash Table, Doubly Linked List Companies (2024–2025): Meta (very high), Google (high), Amazon (high), Stripe (medium) Loop position: The canonical “LRU was the warm-up” follow-up. If LRU was Week 4 (p16), this is the Senior+ retort: “now evict by frequency, still O(1).”

1. Quick Context

LFU = Least Frequently Used. When the cache is full, evict the key with the lowest access count. Tie-break by LRU within that count. All ops must be O(1).

State (canonical O(1) design):

  • key_to_node: dict[key, Node] where Node holds (key, val, freq).
  • freq_to_dll: dict[freq, DoublyLinkedList] — each list orders its nodes by recency (head = most recent).
  • min_freq: int — pointer to the current minimum frequency.

The trick that makes this O(1): min_freq never needs to scan. It only increments (on _bump) or resets to 1 (on new put).

🔗 https://leetcode.com/problems/lfu-cache/

STOP. 25-min timer. If you reach for heapq or SortedDict, you’re already wrong.

3. Prerequisite Concepts

  • LRU Cache (p16) — the OrderedDict / dict-of-DLL pattern.
  • Doubly-linked list with O(1) remove(node) given a node pointer.
  • The “bucket by attribute, LRU within bucket” pattern (also used in p105 All O(1)).

4. How to Approach (FRAMEWORK 1–9)

1. Restate: Cap-bounded cache; on full, evict min-frequency key (LRU on tie); get and put are O(1).

2. Clarify: What if capacity=0? All puts are no-ops, get always -1. What does put of existing key do? Updates val AND counts as one access.

3. Examples: cap=2; put(1,1), put(2,2), get(1)→1 [freqs: 1→2, 2→1], put(3,3) → evicts 2 (min freq=1), get(2)→-1.

5. Brute: dict[key, (val, freq, last_tick)]; on evict, scan for min (freq, last_tick). O(n) per op.

6. Target: O(1) per op.

7. Pattern: Two-level index — by frequency, then by recency. min_freq increments naturally.

8. Develop: see solution.py.

9. Test: cap=0; cap=1; ties on freq; min_freq jump when the only freq-1 bucket empties.

5. Progressive Hints

See hints.md.

6. Deeper Insight

6.1 Why min_freq is O(1) without scanning

min_freq only changes in three places:

  1. New insert (put of unseen key)min_freq = 1 always (the new key has freq 1).
  2. _bump empties the min_freq bucketmin_freq += 1 (the node bumped from min_freq became min_freq+1; nothing else has frequency strictly less).
  3. Capacity-zero / never-set → undefined; never read.

There is no path where min_freq needs to search for the next non-empty bucket. This is the single insight that makes O(1) work.

6.2 Why we need a doubly-linked list, not just a list of keys

We need O(1) remove(node) given only the node. Singly-linked list requires a parent scan (O(n)). Python’s OrderedDict provides this internally as a dict + DLL hybrid — you can use OrderedDict per frequency bucket and call popitem(last=False) for O(1) LRU eviction.

6.3 LRU vs LFU — production tradeoffs

PropertyLRULFU
One-hit wondersPollute the cacheEvicted quickly (good)
Suddenly-cold “hot” keysEvict (responsive)Sticky (bad — old hot keys hog cache)
ImplementationTrivial (OrderedDict)The above 60 lines
Production fix for LFU stickinessTinyLFU / LFU with aging / W-TinyLFU (Caffeine, Linux page cache)

6.4 W-TinyLFU (Caffeine, the JVM cache library)

Production caches don’t use plain LFU because of the “stickiness” problem. W-TinyLFU uses a small Window LRU in front of an admission filter (Count-Min Sketch estimating frequency) plus the main LFU. New entries enter the window LRU; on eviction from window, they compete against the LFU victim — admission is granted only if the new entry’s estimated frequency exceeds the victim’s. Result: handles bursts AND long-tail well. This is what Java’s Caffeine library does (the de facto JVM cache).

6.5 Family: cache eviction

  • LC 146 LRU Cache (p16).
  • LC 460 LFU Cache (this).
  • LC 588 Design In-Memory File System (p100) — different concern.
  • LC 1756 Design Most Recently Used Queue — LRU variant.

7. Anti-Pattern Analysis

  1. Use a heapq of (freq, tick, key) — heap ops are O(log n), not O(1). And you can’t update a key’s freq without lazy-deletion gymnastics. Downlevel signal at Senior+.
  2. Use SortedDict / bisect — same problem; O(log n) per op.
  3. Scan for min_freq — O(n) on every eviction.
  4. Forget that put of existing key counts as an access — your get-then-put sequence breaks LRU tie-break.
  5. Forget LRU tie-break within a frequency — interviewer’s hidden test will hit this.
  6. min_freq not updated when the min bucket empties — silent wrong eviction.
  7. Don’t handle capacity == 0 — crash on first put.

8. Skills & Takeaways

  • Two-level “bucket by attribute, LRU within bucket” pattern (reused in p105).
  • min_freq invariant: monotonic-ish pointer that never scans.
  • Why O(log n) is a downlevel signal on cache hot paths.

9. When to Move On

  • Implement cold in 25 min, all ops O(1), no heapq/SortedDict.
  • Articulate why min_freq is O(1) without scanning.
  • Name W-TinyLFU as the production fix for LFU stickiness.

10. Company Context

  • Meta: L5/E5; appears in the “design but coding” round. Recruiters publish it on the prep list.
  • Google: L4/L5; “Show me you can build a real cache. LRU is too easy.”
  • Amazon: SDE2/SDE3; OE round.
  • Stripe: Reliability/Infra teams; “we cache payment-method tokens — design eviction.”

Strong: “Bucket by frequency, doubly-linked list per bucket, min_freq pointer that only ever increments or resets to 1. Both get and put strictly O(1).”

Weak: “I’ll use a min-heap of (freq, time, key).” — interviewer marks “doesn’t know O(1) is achievable.”

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff+
O(1) achievedAfter heavy hintsAfter hintsColdCold
Reaches for heapYesSometimesNoNo (and says why not)
LRU tie-breakForgetsMentionsImplementsImplements + tests for it
min_freq invariantDoesn’t seeGuessesArticulatesProves the invariant
W-TinyLFU / CaffeineKnows ofCites and contrasts

12. Level Delta

  • Mid (L4): Gets to working LFU using heapq with lazy deletion. Works; not O(1). Hire-with-reservations.
  • Senior (L5): Cold O(1) implementation with DLL-per-freq and correct min_freq. Articulates LRU tie-break. Strong hire.
  • Staff (L6): + explains why O(log n) is a downlevel signal on hot paths; + cites W-TinyLFU as the production fix; + benchmarks both with the interviewer.
  • Principal (L7+): + discusses admission policy (TinyLFU’s Count-Min Sketch) vs eviction policy; + production load patterns where each wins (one-hit-wonder traffic → favor frequency; bursty hot → favor recency; web cache → hybrid). + mentions that Linux’s page cache is essentially 2-list LRU with promotion (active vs inactive LRU lists), a different attack on the same stickiness problem.

13. Follow-up Q&A

Q1: Make it thread-safe. A1: Coarse: single mutex around get/put. Better: read-write lock with reader-mostly assumption is wrong here (every get mutates). The realistic answer is shard the cache by hash(key) % N so contention is reduced, with one mutex per shard. Caffeine’s lock-free implementation uses async eviction + ring buffers per thread for get-recording.

Q2: Aging — old “hot” keys stay sticky. Fix? A2: Periodically halve all frequencies (decay). Or use W-TinyLFU with Count-Min Sketch that has built-in reset. Or 2Q (a queue-based LRU variant — Linux page cache).

Q3: Persist across restarts. A3: Write the key-value AND frequency to a log on put. On restart, rebuild. Caches usually don’t bother — recompute is the point of a cache.

Q4: Distributed — eviction across N nodes. A4: Consistent hashing maps keys to nodes (each node runs local LFU). Frequencies aren’t shared across nodes (too chatty). Total fleet capacity = N × per-node capacity.

Q5: Why not Python OrderedDict per frequency bucket? A5: You absolutely should — it’s a built-in dict+DLL. popitem(last=False) is the O(1) LRU evict. This solution shows the manual DLL for clarity but the OrderedDict version is what you’d write in 10 minutes in an interview.

14. Solution Walkthrough

See solution.py:

  • LFUCache — canonical DLL-per-frequency with min_freq pointer.
  • LFUCacheOrderedDictOrderedDict-per-frequency variant (concise, idiomatic Python).
  • LFUCacheBrute — O(n) scan oracle for stress testing.
  • Stress test: random get/put sequences, compare outputs against the brute.

15. Beyond the Problem

Principal code review: “Plain LFU has a well-known failure mode in production: stickiness. A page that was hot for a week and then went cold sits in the cache forever, blocking the actually-active long tail. Your min_freq pointer never makes progress because the cold-but-high-frequency keys keep min_freq at 1 (where the new arrivals live). Every production cache you depend on — Caffeine (JVM), Linux page cache (2-list active/inactive LRU with promotion), Redis LFU mode (uses logarithmic counter with decay), memcached LRU+slab — has solved this exact problem in three families of ways: (1) admission filter (TinyLFU, Count-Min Sketch — decide whether a new entry deserves to evict an existing one), (2) counter decay (Redis halves counters periodically; Linux ages by clock), (3) two-queue / segmented (Linux’s active/inactive lists; 2Q algorithm). The Staff+ signal in this round is that you implement the textbook LFU cleanly AND name at least one of those production patches without prompting. The interviewer is testing whether you’ve ever read the source of Caffeine or the Linux MM subsystem — the answer that gets you the level is yes, and here’s the tradeoff.”