p08 — Linked List Cycle

Source: LeetCode 141 · Easy · Topics: Linked List, Two Pointers, Hash Table Companies (2024–2025 frequency): Amazon (high), Microsoft (high), Google (very high), Meta (high), Bloomberg (medium), Apple (medium) Loop position: phone screen; the lead-in to LC 142 (find cycle start), LC 287 (Find Duplicate Number), LC 202 (Happy Number)

1. Quick Context

The interviewer’s real question is: “Do you know Floyd’s tortoise-and-hare, and can you prove it terminates?” The HashSet solution gets you Mid-level credit. The two-pointer solution with a verbal correctness sketch gets you Senior. Articulating the modular-arithmetic proof gets you Staff. This is one of the rare problems where the justification matters as much as the code.

What it looks like it tests: detect a cycle in a linked list. What it actually tests: can you reason about an algorithm whose correctness is non-obvious? Can you explain why a slow and fast pointer must meet inside any cycle, instead of just asserting it?

2. LeetCode Link + Attempt Gate

🔗 https://leetcode.com/problems/linked-list-cycle/

STOP. Set a 12-minute timer. Code BOTH the HashSet version and Floyd’s. Do not read past this section until solved or expired.

If repeat: target 4 min for Floyd’s. If you reach for HashSet first reflexively, that’s a Mid signal — practice Floyd’s until it’s your reflex.

3. Prerequisite Concepts

Linked list traversal, the None terminator (phase-01 §4)
Two-pointer / fast-slow pattern (phase-02 §1 Two Pointers)
HashSet O(1) average membership; identity (id()) vs value as the set key — for nodes, identity is correct because two distinct nodes can have equal values
Modular arithmetic intuition: if two runners on a circular track move at different speeds, they will eventually be at the same point (gap closes by 1 each step in tortoise-and-hare)

4. How to Approach (FRAMEWORK Steps 1–9 applied)

Step 1 — Restate: “Given the head of a singly linked list, return True if any node’s next reaches a previously-visited node (a cycle), else False.”

Step 2 — Clarify:

“Empty or single-node input?” (Both possible; cycle requires at least 2 nodes including the self-loop edge case.)
“Can the cycle start at the head itself (entire list is a cycle)?” (Yes.)
“Is the cycle always reachable from head?” (Yes by definition — we only follow next from head.)
“Allowed to modify the list?” (Discuss; the in-place “mark with sentinel” approach exists but mutates input. Floyd’s doesn’t need it.)

Step 3 — Constraints: N up to 10^4 in LC. Both HashSet O(N) space and Floyd’s O(1) space fit; production prefers O(1).

Step 4 — Examples:

[3, 2, 0, -4] with tail → node at index 1 (3→2→0→-4→2→…) → True
[1, 2] with tail → head (1→2→1→…) → True
[1] no cycle → False
[1] self-loop (1→1) → True (single-node cycle — does Floyd’s handle it? Yes if you advance correctly.)
[] → False

Step 5 — Brute Force: HashSet of visited node IDs. Walk; if current node already in set → cycle; else add and advance. O(N) time, O(N) space. Use id(node), not node.val, to avoid collisions on equal values.

Step 6 — Brute Force Complexity: Time O(N), space O(N). Correct, but uses heap memory for what can be done in O(1).

Step 7 — Pattern Recognition: “Cycle detection in a sequence with O(1) memory” → Floyd’s tortoise and hare (slow/fast pointer). Generalizes to: cycle in iterated function f (LC 202, LC 287); duplicate detection in number sequences; Pollard’s rho factorization.

Step 8 — Optimize: Floyd’s. slow = head, fast = head. Loop: advance slow by 1, fast by 2. If fast or fast.next becomes None, no cycle → False. If slow == fast, cycle → True.

Step 9 — Prove correctness: If no cycle, fast reaches None in at most N/2 steps. If cycle exists, both pointers enter the cycle eventually. Once both are in the cycle of length C, the gap between them (measured as (fast - slow) mod C) increases by exactly 1 each iteration (fast gains 2, slow gains 1, net +1). Since the gap is bounded in [0, C-1], it must take the value 0 within C iterations — that’s when they meet.

5. Progressive Hints

If stuck for 5 minutes, open hints.md. One hint, 5-min timer, retry.

6. Deeper Insight — Why It Works

The pigeonhole / modular argument: consider only the cycle phase (after both pointers are inside the cycle of length C). Each step, slow advances by 1 mod C, fast by 2 mod C. Their relative position is (fast_pos - slow_pos) mod C, which changes by +1 mod C each step. Over C iterations, it takes every value in {0, 1, …, C-1} — so it hits 0 (collision) within C steps. There is no input that escapes detection.

Why 1 and 2 specifically: any pair of speeds (a, b) with gcd(b - a, C) = 1 would also work, but (1, 2) is the simplest: gap increment is always 1, gcd is always 1 (any C). Using (1, 3) would have a gap increment of 2; on even-length cycles, the gap might be permanently odd, never zero. The choice of speeds is not arbitrary.

Why fast and fast.next: fast moves 2, so we check BOTH that fast exists and that fast.next exists before doing fast = fast.next.next. Forgetting fast.next is the classic NoneType crash.

Why HashSet uses id(): the question is structural (same node visited twice), not value-based. Two distinct nodes can have equal val. Using node as the key works in Python (default __hash__ is identity-based), but id(node) is explicit and unambiguous.

Why this generalizes to numerical cycles: treat x → f(x) as a “next pointer” on integers. Then LC 202 (Happy Number — cycle in repeated digit-square-sum) and LC 287 (Find Duplicate Number — cycle in nums[nums[i]]) become structurally identical. The kernel is “Floyd’s on any iterated function.”

7. Anti-Pattern Analysis

Wrong-but-tempting #1 — HashSet on node.val:

seen = set()
while head:
    if head.val in seen: return True
    seen.add(head.val)
    head = head.next

Wrong. Two distinct nodes with the same value would falsely trigger “cycle detected”. For [1, 1, 1, 1] with no cycle (just repeating values), this returns True. Use id(head) or head directly as the set key.

Wrong-but-tempting #2 — Counter / step limit:

count = 0
while head:
    if count > MAX: return True
    head = head.next; count += 1

Heuristic: “if I walk more than 10^4 nodes, must be a cycle.” Fails on inputs that are legitimately longer than your bound and falsely flags. Doesn’t prove anything — interviewers reject as “didn’t think about correctness.”

Wrong-but-tempting #3 — Mutate the list with sentinel values:

while head:
    if head.val == SENTINEL: return True
    head.val = SENTINEL
    head = head.next

Destroys input. Some interviewers tolerate this if you ask permission first; many don’t. And in production, the input might be shared (immutable contract), so this corrupts state. Never the default choice.

Wrong-but-tempting #4 — fast = fast.next.next without null guard:

while slow != fast:
    slow = slow.next
    fast = fast.next.next   # crashes when fast or fast.next is None

Classic NoneType crash. Always guard while fast is not None and fast.next is not None BEFORE advancing fast by 2.

Wrong-but-tempting #5 — Start slow = head, fast = head.next: Off-by-one. The standard formulation is slow = fast = head, advance both at the top of each iteration, then compare. Starting fast ahead introduces an asymmetric first iteration and causes the meeting-point math (used in LC 142) to be wrong.

8. Skills & Takeaways

Generalizable patterns:

Floyd’s cycle detection: on linked lists, on iterated integer functions, on any sequence where you have f(x) = next. Examples: LC 142, 202, 287, 457.
Modular arithmetic for terminat ion proofs: when you can’t enumerate states, model the state space modulo cycle length.
id() for identity-based hashing: when the question is “same object?” not “same value?”

Analogous problems:

LC 142 — Linked List Cycle II (find the start of the cycle — the math gets beautiful)
LC 202 — Happy Number (Floyd’s on digit-square-sum iteration)
LC 287 — Find the Duplicate Number (Floyd’s on nums[nums[i]] — Hard, the constraint of O(1) extra space is what forces Floyd’s)
LC 457 — Circular Array Loop (Floyd’s with directional constraints)
LC 234 — Palindrome Linked List (uses slow/fast to find the middle — same kernel, different goal)
LC 876 — Middle of Linked List (the simplest fast-slow use)

When NOT to use this: if you can read the list twice (then a counter + walk-N/2 finds the middle). If the cycle’s start matters and you don’t want the math (then HashSet to find the first repeat). If input is mutable AND interviewer permits (sentinel marking is fastest in wall-clock).

9. When to Move On (binary; must all be YES)

I wrote Floyd’s in <5 min on the second attempt with correct null guards
I can state aloud why the slow and fast pointers MUST meet inside any cycle (gap+1 mod C argument)
I can explain why speeds (1, 2) and not (1, 3) — even-length cycle hazard
I can implement the HashSet version and articulate why id() matters over val
My stress_test() covers both no-cycle and cycle inputs and passes 1000 iterations
I solved LC 142 (Cycle II) and can derive the meeting-point math
I solved LC 202 (Happy Number) and saw Floyd’s apply to integer iteration

If any unchecked: redo before moving to p09.

10. Company Context

Google (loves the proof — Staff-bar problem)

The framing: Standard problem, then “prove it terminates.”
Misleading example: A no-cycle list that’s long enough to be uncomfortable but not infinite. Some candidates assume “long = cycle” and false-positive.
Adversarial extension: “Now find the START of the cycle.” (LC 142.) Then “What if the cycle exists but the head is NOT inside it?” (e.g., 1→2→3→4→3, cycle starts at 3 — head is not in cycle.)
What they want to hear: Modular arithmetic proof, not “I trust it.” Verbatim phrases like “the gap grows by 1 modulo cycle length each step, so it must reach 0 within C iterations” earn Staff signal.

Amazon (defensive coding)

The framing: Often in a story: “Detect if a workflow has a self-referential dependency.”
Misleading example: Single-node self-loop. Many naive Floyd’s implementations skip this because of how they initialize.
Adversarial extension: “What if the input could already be partially corrupted (cyclic) and we need to safely traverse just the acyclic prefix?” → returns the prefix length AND cycle detection.
What they want to hear: “Let me handle empty and single-node first.” Then proceed.

Microsoft (phone screen mainstay)

The framing: Often back-to-back with reverse and merge.
What they want to hear: Both HashSet and Floyd’s solutions, with explicit tradeoff statement: “HashSet is O(N) space but simpler; Floyd’s is O(1) space, which is the production preference.”

Meta (fast follow-ups)

The framing: Linked List Cycle → Cycle II → Happy Number, in rapid sequence. They use this to test whether you generalize.
What they want to hear: Naming the pattern: “this is Floyd’s cycle detection; it applies to any iterated function, not just linked lists. Happy Number is the same algorithm on integer→integer iteration.”

Bloomberg / Apple

The framing: Standard. Bloomberg sometimes adds: “If detected, return the cycle length too.”
What they want to hear: Awareness that once Floyd’s detects, you can count cycle length by walking one pointer until they re-meet.

11. Interviewer’s Lens

Phase	Strong signal	Weak signal	Scorecard phrase (strong)
Reading problem	Asks “self-loop allowed?” + “entire list a cycle?”	Assumes “normal” cycle	“Considers degenerate cases — production-mindful”
Pre-coding	Offers both HashSet and Floyd’s, picks Floyd’s for space, justifies	Goes straight to HashSet	“Knows the O(1) algorithm and chooses it for the right reason”
Coding (Floyd’s)	Guards `fast and fast.next` before advancing	Crashes on edge inputs	“Defensive pointer arithmetic — strong instinct”
Correctness justification	Articulates modular-arithmetic / pigeonhole argument	“It just works” or hand-waves	“Owns the algorithm’s correctness — Staff signal”
Follow-up	Generalizes to integer iteration (LC 202, 287)	Sees only the linked list version	“Recognizes the kernel across problem families”

The scorecard line that gets you the offer: “Implemented Floyd’s correctly with correct null guards; articulated termination proof using modular arithmetic; generalized to integer iteration.”

The scorecard line that loses you: “Used HashSet on values not identity; crashed on self-loop; could not explain why the algorithm works.”

12. Level Delta

Level	What their answer looks like
Mid	HashSet version with `id(node)`. Works. Can’t explain Floyd’s beyond “fast catches slow.” ~8 min.
Senior	Floyd’s with correct null guards. Handles empty, single-node, self-loop. Articulates “fast gains on slow by 1 per step, so within cycle length they must meet.” ~6 min.
Staff	All of Senior + names the pattern + connects to LC 142, 202, 287 + provides explicit modular-arithmetic correctness sketch + mentions why (1,2) and not (1,3). ~5 min.
Principal	All of Staff + discusses real applications (Pollard’s rho factorization, garbage collector cycle detection, distributed deadlock detection) + offers Brent’s algorithm as a Floyd’s variant with better constants + mentions that for very long acyclic prefixes Floyd’s wastes work and Brent’s is better in practice. ~4 min with full discussion.

13. Follow-up Questions & Full Answers

Follow-up 1: “Find the start of the cycle.” (LC 142)

Signal sought: Can you do the math?

Full answer: “After Floyd’s detects the meet, reset one pointer to head. Advance both by 1 step. They meet at the cycle start. Proof: let L = length of non-cycle prefix, C = cycle length, k = distance from cycle start to the meet point. When slow has moved L+k steps, fast has moved 2(L+k) and is at the same node, so 2(L+k) − (L+k) = L+k is a multiple of C: L+k ≡ 0 (mod C), so L ≡ −k ≡ C−k (mod C). That means walking L more steps from the meet point lands exactly at the cycle start. Reset one pointer to head, advance both by 1, they meet there. O(N) time, O(1) space.”

Follow-up 2: “What’s the largest input where HashSet would be cheaper than Floyd’s in wall-clock?”

Signal sought: Engineering pragmatism, not just asymptotics.

Full answer: “HashSet does ~1 hash + 1 compare per node, but allocates a hash table that grows. Floyd’s does 3 pointer-derefs per slow-step (slow advance + fast.next + fast.next.next). For tiny N (under ~100) and cycle absent, HashSet is sometimes marginally faster due to fewer pointer hops, but allocates memory. For any meaningful N or when allocations matter (real-time systems, embedded), Floyd’s wins. The space difference (O(N) vs O(1)) almost always dominates the decision.”

Follow-up 3: “What if speeds were (1, 3) instead of (1, 2)?”

Signal sought: Did you actually understand the proof?

Full answer: “The gap closes by 2 each step (modulo C). If C is odd, gcd(2, C) = 1 and the gap eventually hits 0. If C is even, the gap stays at the same parity forever — if it starts odd, it never reaches 0. So (1, 3) fails on even cycle lengths. The choice of (1, 2) is what guarantees gcd(speed difference, C) = 1 for every C ≥ 2.”

Follow-up 4: “Apply Floyd’s to find a duplicate in an array of N+1 integers, each in [1, N].” (LC 287)

Signal sought: Generalization to non-list inputs.

Full answer: “Treat the array as a function: f(i) = nums[i]. Start at i=0 (or N, any value never produced as output works as a sentinel; here 0 works because nums values are in [1,N]). Define next(x) = nums[x]. Since there’s a duplicate value d, two indices map to d, so the iteration x → nums[x] must form a cycle at d. Apply Floyd’s: slow advances by slow = nums[slow], fast by fast = nums[nums[fast]]. Detect meet, then find cycle start using the LC 142 trick. The cycle start IS the duplicate. O(N) time, O(1) extra space — beats sort and HashSet.”

Follow-up 5 (Senior signal): “How would you test cycle detection without modifying production data?”

Signal sought: Testing discipline.

Full answer: “Three layers. (1) Build a builder utility that constructs a list AND injects an optional cycle at index k — separate the test scaffolding from the algorithm. (2) Property tests: (a) any acyclic list returns False, (b) any cyclic list returns True, (c) cycle length 1 (self-loop) returns True, (d) entire list is the cycle returns True. (3) Memory profile: assert Floyd’s allocates zero new nodes (vs HashSet’s O(N)). My solution.py uses a build_with_cycle(values, cycle_at) helper precisely for layered testing.”

14. Full Solution Walkthrough

See solution.py.

The file has six sections:

ListNode — minimal class.
has_cycle_hashset(head) — the brute force using id() as key.
has_cycle(head) — Floyd’s, with INVARIANT comment. Guards fast and fast.next before advancing.
build_with_cycle(values, cycle_at) — helper to construct test inputs with controlled cycle start (or -1 for no cycle).
stress_test() — generates 1000 random cases, half with cycles, asserts both implementations agree.
edge_cases() — empty, single (no cycle), single self-loop, two-node cycle, entire-list cycle, cycle at tail.

Decisions justified:

Why id(node) not node: explicit identity contract; works in any language port.
Why slow = fast = head not fast = head.next: symmetric initialization is what makes the LC 142 math work.

15. Beyond the Problem — Production Reality

Real-world cycle detection:

Garbage collectors (Python’s reference-cycle collector, JVM’s old-generation tracer): cycle detection is critical for reclaiming memory. The algorithms are richer than Floyd’s (they use tri-color marking) but Floyd’s is the conceptual ancestor.
Distributed deadlock detection: in databases (e.g., Postgres, Spanner), each transaction is a node, “waits-for” is the edge. Cycle = deadlock. Detection algorithms run periodically over the wait-for graph.
Build systems: detecting circular dependencies in Bazel, Cargo, npm — same family of algorithm, lifted to DAGs.
Pollard’s rho factorization: Floyd’s cycle detection on the iteration x → (x² + c) mod n factors large integers. Same algorithm, profoundly different application.

Real systems this is the kernel of:

Linux kernel kref debugging: detecting object reference cycles.
TensorFlow / PyTorch dataflow graphs: cycle detection before compilation.
Service mesh dependency analysis: Istio uses cycle detection over service-call graphs to flag bad architecture.

Principal-engineer code review comment: “Floyd’s is correct but in our production we use Brent’s algorithm — it has the same O(N) bound but fewer pointer dereferences on long acyclic inputs. We saw a 30% speedup in our packet-trace analyzer. Worth knowing both. Also: please add metric emission so if cycle is detected in our workflow scheduler, we page on it — silent cycle handling has caused two outages.”

LeetCode Interview — Extreme Coding