p98 — Design Circular Queue

Source: LeetCode 622 · Medium · Topics: Design, Array, Ring Buffer Companies (2024–2025): Amazon (high), Microsoft (high), Bloomberg (high) Loop position: Foundation of every fixed-size queue in systems software — kernel pipes, audio buffers, lock-free MPMC queues, network packet rings.

1. Quick Context

Implement a fixed-capacity queue with O(1) enQueue, deQueue, Front, Rear, isEmpty, isFull. The “circular” part: when the underlying array’s logical end is reached, wrap to index 0.

State (canonical): arr: List[int], front: int (read index), size: int (count). Rear index = (front + size - 1) % k. Next-write = (front + size) % k.

Alternative: front + rear pointers and a size-disambiguator (or sacrifice one slot to distinguish full from empty).

🔗 https://leetcode.com/problems/design-circular-queue/

STOP. 20-min timer.

3. Prerequisite Concepts

  • Modular arithmetic on array indices.
  • “Distinguish full from empty” problem (two equal pointers).

4. How to Approach (FRAMEWORK 1–9)

1. Restate: Fixed-size FIFO with O(1) ops on a circular array.

2. Clarify: What does enQueue return when full? False. What does Front() return when empty? -1.

3. Examples: k=3; enQ(1),enQ(2),enQ(3) → full; Front()=1, Rear()=3; deQ(); enQ(4) → wraps; Front()=2, Rear()=4.

5. Brute: Use collections.deque(maxlen=k). Works but defeats the point.

6. Target: O(1) every op.

7. Pattern: front + size representation avoids the “full vs empty” ambiguity cleanly.

8. Develop: see solution.py.

9. Test: wrap-around; fill-drain-fill; deQ on empty → False; enQ on full → False.

5. Progressive Hints

See hints.md.

6. Deeper Insight

6.1 The “full vs empty” problem

If you store only front and rear pointers (no size), then front == rear is ambiguous: empty OR full. Three resolutions:

  1. Keep a size counter (this solution).
  2. Sacrifice one slot: treat “full” as (rear + 1) % k == front. Wastes one slot.
  3. Use an extra “is_full” flag bit.

The size counter is cleanest in software; the “sacrifice one slot” trick is used in hardware (single-producer single-consumer rings where atomic update of two values is impossible).

6.2 Why O(1) matters

In systems software — audio buffers, kernel pipes, packet rings — the queue is on the hot path. An O(n) op (like list.pop(0) in Python, which shifts the entire array) blows the latency budget. Ring buffers give you O(1) FIFO with predictable cache behavior: writes go to a single hot cache line.

6.3 Lock-free MPMC (multi-producer multi-consumer)

Real production queues (Disruptor, boost::lockfree::queue, Linux’s kfifo) are lock-free ring buffers with atomic compare-and-swap on head/tail and memory barriers between writes. The data structure is the same; the concurrency control is the hard part.

6.4 Double-ended (deque) variant

LC 641 Design Circular Deque adds insertFront and deleteLast. Same ring buffer with both ends mutable.

6.5 Family: ring/queue design

  • LC 622 Design Circular Queue (this).
  • LC 641 Design Circular Deque.
  • LC 232 Implement Queue using Stacks (amortized O(1)).
  • LC 225 Implement Stack using Queues.

7. Anti-Pattern Analysis

  1. list.pop(0) — O(n); defeats the point.
  2. No wrap (use modulo) — IndexError after k operations.
  3. Track only front and rear, no size — full-vs-empty ambiguity.
  4. enQueue doesn’t return bool — LC requires bool return.
  5. Front() on empty returns arr[front] — wrong; must return -1.
  6. Use deque(maxlen=k) — works but trivializes the problem; interviewer wants the manual ring.

8. Skills & Takeaways

  • Ring buffer with front + size representation.
  • Full-vs-empty disambiguation.
  • O(1) FIFO with cache locality.

9. When to Move On

  • Implement cold in 15 min.
  • Articulate full-vs-empty resolutions.
  • Extend to deque (LC 641).

10. Company Context

  • Amazon: L5; “design with code” round staple.
  • Microsoft: L62.
  • Bloomberg: every Bloomberg systems-y phone screen.

Strong: “Ring buffer with front + size. Next-write index = (front + size) % k. Rear = (front + size - 1) % k. All ops O(1).”

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff+
O(1) opsAfter hintColdColdCold
WrapForgetsFixesColdCold
Full vs emptyConfusedFixesCold+ cites variants
ConcurrencyNoneNoneMentions+ Disruptor, kfifo

12. Level Delta

  • Mid (L4): Works after 1 bug fix (off-by-one on rear index).
  • Senior (L5): Clean cold + handles all edge cases.
  • Staff (L6): + extends to deque + cites the three full-vs-empty resolutions + mentions lock-free variant.
  • Principal (L7+): + names production systems (LMAX Disruptor, Linux kfifo, boost lock-free) + discusses memory ordering / barriers / false sharing (padding to avoid head and tail living on the same cache line).

13. Follow-up Q&A

Q1: Thread-safe single-producer single-consumer. A1: Use atomic loads/stores on head and tail with appropriate memory ordering (acquire/release). No locks needed; this is the SPSC ring buffer.

Q2: Multi-producer multi-consumer. A2: CAS on head and tail with seqlock per slot, OR full Disruptor-style with sequence numbers. Significantly harder.

Q3: Make it growable. A3: Allocate new array of 2k; copy elements in logical order (front..front+size); reset front=0. Amortized O(1) like vector.

Q4: Cache-line padding. A4: Place front and rear on different cache lines (typically 64-byte padding between them) to avoid false sharing in concurrent code.

Q5: Why is deque(maxlen=k) insufficient as an answer? A5: The interviewer is testing your ability to design the data structure, not your knowledge of stdlib.

14. Solution Walkthrough

See solution.py:

  • MyCircularQueuefront + size ring buffer.
  • MyCircularQueueDequedeque(maxlen=k) brute (oracle).
  • Stress test driving random op sequences against both.

15. Beyond the Problem

Principal code review: “The circular queue is the atomic unit of high-performance messaging. Every system you depend on uses one: Linux kernel kfifo for inter-process pipes, ALSA for audio device buffers, NVMe submission/completion queues for SSDs, network drivers’ packet rings (e.g., DPDK, io_uring), and userspace concurrent queues (LMAX Disruptor at 100M+ ops/sec, used in financial exchanges). The leetcode problem is the single-threaded skeleton; production complexity is in three orthogonal axes: (1) concurrency (lock-free via atomic head/tail with memory barriers; or full Disruptor with per-slot sequence numbers), (2) cache behavior (pad head and tail to separate cache lines to eliminate false sharing — a single line of pad code can 10× throughput), (3) backpressure semantics (block? drop oldest? drop newest? return error?). The Staff+ recognition is that the data structure itself is a lever: choosing ‘enqueue overwrites oldest on full’ makes it a bounded log (Kafka tail); ‘enqueue blocks’ makes it a flow-control primitive (Go channels, ALSA); ‘enqueue fails’ makes it a back-pressure signal (web server connection queues). Same 30 lines of code; entirely different system semantics.”