p53 — Next Greater Element II

Source: LeetCode 503 · Medium · Topics: Array, Stack, Monotonic Stack Companies (2024–2025 frequency): Amazon (high), Meta (medium-high), Google (medium), Bloomberg (medium) Loop position: The “circular” variant of p52. Tests whether you can extend the template without reinventing it. Off-by-one bugs on the doubling are common.

1. Quick Context

A circular array nums. For each nums[i], find the next strictly-greater element going clockwise (wrapping around at the end). Return -1 if no such element exists anywhere in the array.

Output is an array of the same length, by index.

The trick: simulate two passes by iterating 2n indices and accessing with i % n. Only push real indices on the FIRST pass; the second pass exists solely to resolve items still on the stack.

n = len(nums)
ans = [-1] * n
stack = []   # indices, values STRICTLY DECREASING
for i in range(2 * n):
    v = nums[i % n]
    while stack and nums[stack[-1]] < v:
        ans[stack.pop()] = v
    if i < n:
        stack.append(i)
# anything left in stack truly has no greater anywhere; ans[*] = -1 already

INVARIANT: same as p51 (indices with strictly decreasing values). The doubling guarantees every element gets a chance to be resolved by any larger element in the array.

2. LeetCode Link + Attempt Gate

🔗 https://leetcode.com/problems/next-greater-element-ii/

STOP. 12-min timer.

3. Prerequisite Concepts

Monotonic stack template (p51, p52).
Modular arithmetic for circular array access.
The insight: “wraparound = double the array virtually.”

4. How to Approach (FRAMEWORK Steps 1–9)

Step 1 — Restate: Circular nums. For each i, find the nearest j (going clockwise) with nums[j] > nums[i]; if none anywhere, -1.

Step 2 — Clarify:

“Strict greater?” (Yes.)
“Are values distinct?” (Not necessarily — duplicates allowed.)
“Return values or indices?” (LC: values.)
“Length range?” (n ≤ 10^4.)

Step 3 — Constraints: O(n) ideal.

Step 4 — Examples:

[1,2,1] → [2,-1,2] — the third 1 wraps around to find the 2.
[1,2,3,4,3] → [2,3,4,-1,4] — the last 3 wraps to find 4.
[5,4,3,2,1] → [-1,5,5,5,5] — only the global max has -1; everyone else wraps to find it.
[1,1,1] → [-1,-1,-1] — duplicates aren’t strictly greater.
[3,8,4,1,2] → [8,-1,8,2,3] — last two wrap.

Step 5 — Brute force: For each i, walk j = (i+1) % n, (i+2) % n, ... for up to n-1 steps. O(n²).

Step 6 — Complexity target: O(n).

Step 7 — Pattern: Monotonic stack + circular trick (iterate 2n, push only on pass 1).

Step 8 — Develop: see solution.py.

Step 9 — Test: duplicates, all-equal, strictly descending, strictly ascending, single, the LC examples.

5. Progressive Hints

If stuck for 5 min, open hints.md.

6. Deeper Insight — Why “iterate 2n, push only first n”?

6.1 Why two passes suffice

For any index i, the next greater element (if it exists) is somewhere in the window [i+1, i+1+n-1] in the doubled virtual array of length 2n. Iterating up to 2n - 1 ensures the stack has seen i and then seen every potentially-resolving position.

After 2n iterations, anything STILL on the stack genuinely has no greater element in the entire array, so -1 is correct.

6.2 Why “push only on first pass”?

If we pushed indices on the second pass too, we’d duplicate entries — index 5 would appear twice (once at iteration 5, once at 5+n). The pop logic would resolve them inconsistently.

The fix: pushing only when i < n ensures each index appears at most once on the stack, while the values read (via i % n) cycle through the full array a second time, giving every previously-pushed index a chance to be resolved.

6.3 Equivalent: stack of values only

Some implementations store values on the stack (since LC asks for values). But for circular handling and to support “track the resolving index” follow-ups, INDICES are more flexible. Practice indices.

6.4 What if the array isn’t circular?

Drop the doubling — back to p52’s logic. The two problems differ by exactly one change: iterate 2n instead of n and gate pushes behind i < n.

6.5 Connection to “longest distance to greater”

If instead of the next-greater value you wanted the distance, change ans[stack.pop()] = v to ans[j] = (i - j) % n (where j = stack.pop()). For circular distance, careful modular arithmetic is needed.

6.6 Why doubling = O(n) and not O(2n)?

Asymptotically identical. Each of the n indices is pushed at most once and popped at most once → total amortized work over 2n iterations = O(n). The factor of 2 is constant.

6.7 Why not triple? Why not `n²`?

After two passes, no further pops are possible: anything still on the stack has had 2n - 1 chances and been resisted by every position in the array. So a third pass would do zero useful work.

7. Anti-Pattern Analysis

Forget i < n gate on push → indices appear twice on the stack → wrong answers / infinite loop.
Iterate n instead of 2n → misses wraparound; degrades to p52 logic on circular input.
Use <= instead of < in pop → duplicates incorrectly resolve as if larger (LC asks STRICT greater).
Push values not indices, then try to index back into nums — needs a value→position map and breaks with duplicates.
Iterate 3n or more — no benefit, slight slowdown.
Reset stack between passes — destroys information; defeats the algorithm.
Try to do it with bisect on a sorted view — wrong tool; sorting destroys the “next clockwise” semantics.
Convert nums into nums + nums and run p52 — works but uses 2× memory; the modular trick is cleaner.

8. Skills & Takeaways

The doubling-for-circular trick generalizes to many circular-array problems.
Why push only on first pass is critical.
Index-on-stack vs. value-on-stack trade-off (favor index).
Same template as p51 / p52, just range(2n) and i % n.

9. When to Move On

You can write it cold in < 8 min.
You can explain the i < n push gate and why omitting it breaks the algorithm.
You can derive distance variant from the value variant.
Stress test on 1000 random circular arrays (with duplicates) agrees with brute O(n²).

10. Company Context

Amazon:

L5 onsite. Usually paired with LC 496 as the “harder follow-up.”
Bar Raiser: “Generalized the template cleanly; named the doubling trick; avoided memory-doubling alternative.”

Meta:

L5. Variant: “what if you can query at any index repeatedly?” → precompute the answer array once, O(1) per query.

Google:

L5. Often combined with stream processing.

Bloomberg:

Trading: “for each tick in a circular history buffer, find next higher tick.”

11. Interviewer’s Lens

Signal	Junior	Mid	Senior	Staff/Principal
Recognizes p52 connection	No	“Like that one problem”	“Same template, circular”	+ names the `2n` trick
Push gate	Forgets	Adds after debugging	Anticipates	+ explains why omitting breaks it
Memory-doubling alternative	Uses it	Mentions, dismisses	Avoids; explains modular trick	+ benchmarks both
Distance variant	Stuck	Derives slowly	Derives quickly	+ handles negative distances correctly

12. Level Delta

Mid (L4):

Iterates 2n, gates push, correct on duplicates and wraparound.

Senior (L5):

Articulates the doubling trick and the push gate clearly.
Connects to p51 / p52 explicitly.
Mentions distance variant.

Staff (L6):

Discusses precompute-once pattern for repeated queries.
Discusses how to extend to “next greater within window of size k” (sliding window on circular).
Recognizes the technique as a special case of “treat circular as 2× linear with constraints.”

Principal (L7+):

Discusses persistent monotonic stacks for “what was the answer for index i at simulation time t?”
Discusses streaming circular buffers (telemetry rings) and how to maintain next-greater under push/pop.
Discusses cache behavior: 2× iteration with modular access can hurt prefetcher; alternative is nums + nums flat copy (better locality, 2× memory). Real trade-off in tight loops.

13. Follow-up Q&A

Q1: What if I wanted the actual index, not the value? A1: Store indices on the stack (we already do). Change ans[j] = nums[i % n] to ans[j] = i % n.

Q2: What about the next SMALLER element circularly? A2: Flip the pop condition from < to > (stack becomes monotonically INCREASING).

Q3: What if duplicates count as “greater-or-equal”? A3: Change pop condition to <=. Be careful: now the stack invariant is “non-increasing” (allows equals), and an arrival of equal value resolves the earlier equals.

Q4: Can we avoid the i < n gate by some other trick? A4: Yes — break out of the loop when the stack is empty AND i >= n (no more resolutions possible). But the gate is simpler.

Q5: What’s the worst-case stack size? A5: n, achieved by strictly decreasing input.

Q6: How would you extend to “find the K-th greater element circularly”? A6: Much harder. The monotonic stack collapses information; you’d need a different structure (e.g., persistent segment tree or merge-sort tree).

14. Solution Walkthrough

See solution.py. Three implementations:

next_greater_elements_doubled — iterate 2n, push only on first pass. O(n).
next_greater_elements_concat — concatenate nums + nums, run p52 logic, slice first n. O(n), 2× memory.
next_greater_elements_brute — O(n²) scan-around-the-ring oracle.

Stress test on 1000 random arrays with duplicates agrees.

15. Beyond the Problem

Principal-engineer code review comment:

“The modular i % n is conceptually clean but creates a branch the CPU branch-predictor handles fine but the prefetcher hates. In a hot loop processing millions of circular buffers (think: market-data tick history), the nums + nums concatenation version benchmarks 1.3–1.7× faster on x86 due to linear access. Don’t optimize prematurely, but know the trade-off exists. Always measure before deciding which ‘cleaner’ code wins.”

LeetCode Interview — Extreme Coding