Lab 05 — Monotonic Stack: Largest Rectangle In Histogram

Goal

Master the monotonic-stack pattern on its hardest canonical problem. Deliverable solves LC 84 in O(N) time, O(N) space, and you can articulate why each bar is pushed and popped at most once, why a sentinel 0 at the end is required (or how to handle the leftover stack), and how this generalizes to maximal-rectangle-of-1s in a 2D grid.

Background Concepts

Monotonic stack invariant; index-vs-value storage in stacks; sentinel technique for clean termination; amortized analysis (each index pushed and popped once); the rectangle’s “left boundary = new top after pop” / “right boundary = current index” trick. Review pattern Monotonic Stack.

Interview Context

LC 84 is one of the hardest commonly-asked Mediums (often labeled Hard). It appears at Google, Meta, and quant firms. The interview signal is whether you can derive the algorithm from a smaller cousin (LC 496 — Next Greater Element). Naive candidates write O(N²) “for each bar, expand left and right”. Decent candidates derive a left-bounds array and right-bounds array via two monotonic stack passes. Strong candidates do it in one pass with the popped-bar’s-rectangle trick. Elite candidates immediately observe that LC 85 (Maximal Rectangle) is just LC 84 applied per row of a derived heights array.

Problem Statement

Given an array heights representing histogram bar heights of equal width 1, return the area of the largest rectangle that fits within the histogram.

Constraints

1 ≤ N ≤ 10^5
0 ≤ heights[i] ≤ 10^4

Clarifying Questions

Are heights non-negative? (Per constraints, yes.)
Can heights be zero? (Yes — and zero-height bars effectively reset the candidates, since no rectangle can include them.)
Are bars unit-width? (Yes — width 1 each, so the rectangle’s width is just the number of consecutive bars that all have at least the rectangle’s height.)
Multiple equal-area rectangles — return any area, or specify? (Just the area; LC asks for the max.)
Is the answer always achievable in 32-bit? (max_height × N = 10^4 × 10^5 = 10^9, fits 32-bit signed but barely. Use 64-bit for safety in C++/Java.)

Examples

Input	Output
`[2,1,5,6,2,3]`	10 (rectangle of height 5 covering indices 2..3 — width 2 — wait, height 5 × width 2 = 10? actually height 5 spans indices 2..3 (heights 5,6), so width 2 rectangle of height 5 — area 10. Or height 6 over index 3 alone, area 6. Best is 10.)
`[2,4]`	4
`[2,1,2]`	3 (rectangle of height 1 spanning all 3 bars)
`[6,7,5,2,4,5,9,3]`	16
`[0]`	0

Initial Brute Force

For each bar i, expand left and right while heights stay ≥ heights[i]; rectangle area is heights[i] × (right - left + 1).

def largest_rect_brute(heights):
    n = len(heights)
    best = 0
    for i in range(n):
        l = r = i
        while l > 0 and heights[l - 1] >= heights[i]: l -= 1
        while r < n - 1 and heights[r + 1] >= heights[i]: r += 1
        best = max(best, heights[i] * (r - l + 1))
    return best

Brute Force Complexity

Time O(N²) worst case (all-equal heights). Space O(1). At N=10⁵, ~10^10 ops — TLEs everywhere.

Optimization Path

Observation: for each bar i, we want the largest rectangle of at least height heights[i] that contains i. Width = (next-smaller-to-the-right) − (previous-smaller-to-the-left) − 1. If we know “previous smaller index” pl[i] and “next smaller index” pr[i] for every bar, area is heights[i] * (pr[i] - pl[i] - 1), computed in O(N) total.

Both pl and pr are computable by a single monotonic-stack pass each. Even better: a single pass with a stack-of-indices in strictly increasing height order. When we encounter a smaller bar, we pop the stack; for each popped index j, the current index i is its pr[j] and the new top of the stack (after the pop) is its pl[j]. Compute j’s rectangle area on the spot.

Final Expected Approach

def largest_rectangle_area(heights):
    stack = []  # indices, heights[stack] strictly increasing
    best = 0
    n = len(heights)
    for i in range(n + 1):
        cur = 0 if i == n else heights[i]
        while stack and heights[stack[-1]] > cur:
            top = stack.pop()
            left = stack[-1] if stack else -1
            width = i - left - 1
            best = max(best, heights[top] * width)
        stack.append(i)
    return best

The trick: iterate to n + 1 with a sentinel cur = 0. This forces every remaining bar to be popped (since 0 is strictly less than any positive height), so we don’t need a separate post-loop drain.

Data Structures Used

A stack of indices into heights, holding indices whose heights are strictly increasing from stack-bottom to stack-top.
A running best integer.

Correctness Argument

Stack invariant: at every point, heights[stack[0]] < heights[stack[1]] < ... < heights[stack[-1]].

Maintenance: before pushing i, we pop all indices j with heights[j] >= heights[i] (using > ensures strict; for this problem, > is correct and >= would over-pop). After popping, all remaining stack entries have height < heights[i], so pushing i preserves the invariant.

Per-popped-bar rectangle is correct: when we pop top = stack.pop(), by the invariant the new top’s height < heights[top], so pl[top] is stack[-1] (or -1 if stack empty). The current index i is the first index since top with heights[i] < heights[top] (because all indices between top+1 and i-1 had heights ≥ heights[top] and were either still on the stack or popped earlier — but if popped earlier, they were popped by a strictly-smaller bar, contradiction). So pr[top] = i. Width is i - pl[top] - 1.

Sentinel correctness: at i = n we use cur = 0, smaller than any positive height. This pops every remaining index, computing each one’s rectangle with pr = n.

Amortized O(N): each index pushed once, popped at most once. Inner while loop’s total iterations across the outer loop sum to ≤ N.

Complexity

Time O(N) amortized. Space O(N) for the stack worst case (strictly increasing input).

Implementation Requirements

Store indices, not heights, in the stack — you need indices to compute width.
Use the sentinel trick (i in range(n + 1) with cur = 0 at i = n) for clean code, OR drain the stack after the main loop with pr = n. Pick one; the sentinel is preferred.
Use strict > in the pop condition. For “largest rectangle”, > and >= give the same answer (rectangles with equal heights are accounted for by their leftmost bar), but >= over-pops and confuses the bookkeeping in cousin problems.
Use 64-bit arithmetic in Java/C++ for the area: 10^4 × 10^5 = 10^9, fits 32-bit, but Integer.MAX_VALUE = 2.1×10^9 and habits matter.

Tests

Smoke: [2,1,5,6,2,3] → 10.
Unit: [1] → 1, [1,1,1,1] → 4, [1,2,3,4,5] → 9 (rectangle of height 3 over indices 2..4), [5,4,3,2,1] → 9.
Edge: [0] → 0, [0,0,0] → 0, [N copies of 1] → N, [10000] → 10000.
Adversarial: strictly increasing — every bar stays on the stack until the sentinel; tests the drain. Strictly decreasing — every bar popped immediately; tests the per-bar bookkeeping.
Large: N = 10⁵, random heights; assert <100ms. Cross-check against brute on a 1000-prefix.
All same: [7,7,7,7,7,7,7,7] → 56.
Random: 100 random inputs of size ≤ 200 against brute.

Follow-up Questions

“Maximal rectangle of 1s in a 2D matrix (LC 85)?” → for each row r, build heights[c] = (heights[c] + 1 if mat[r][c] == 1 else 0). Then run LC 84 on each row’s heights. Total O(R·C).
“Largest rectangle of equal value?” → modify the predicate; same skeleton.
“Number of submatrices with all 1s (LC 1504)?” → variant where for each row + column we count.
“What if heights can be updated?” → segment tree with merge function; Phase 3 territory.
“What if N is so large the stack doesn’t fit?” → the stack is bounded by N; if N doesn’t fit in memory, you have a bigger problem. (Answer: stream-based algorithms with reduced memory exist for some restricted versions.)
“Why does >= give the same answer here?” → bars of equal height to the popped bar fail to extend its rectangle leftward (they’d be popped first or become the new left boundary), so the answer is unchanged. Subtle, worth a sentence in the interview.

Product Extension

In ad-hoc analytics, the “largest rectangle” maps to “the longest time window during which all of K monitored systems exceeded a threshold” — useful for SLA breach detection. Each system’s per-time-bin status forms a histogram; the largest rectangle is the worst sustained breach. The same single-pass stack algorithm processes a stream of metric snapshots in O(1) amortized per snapshot.

Language/Runtime Follow-ups

Python: native list as a stack — append / pop are O(1) amortized. Skip collections.deque here; the stack-only access pattern doesn’t benefit from a deque.
Java: prefer ArrayDeque<Integer> over Stack (the latter is a synchronized legacy class with overhead). Or use int[] with a manual top index — fastest for hot loops.
Go: slice-as-stack — stack = append(stack, i), top := stack[len(stack)-1]; stack = stack[:len(stack)-1]. Beware: capacity may grow geometrically and not shrink — fine here since N is bounded.
C++: std::vector<int> is fastest. std::stack<int> adds an unnecessary wrapper. Reserve capacity with stack.reserve(n).
JS/TS: native Array.prototype.push/pop — O(1) amortized. Not as fast as a typed Int32Array for hot loops.
Hot-loop: in Java, int[] + top int outperforms ArrayDeque<Integer> by ~3× due to no boxing.

Common Bugs

Off-by-one width: i - left - 1 vs i - left. The popped bar’s rectangle starts at left + 1 and ends at i - 1 (both inclusive), width = i - left - 1.
Forgetting the sentinel / drain. Without the sentinel 0, indices remaining on the stack are never processed. Their rectangles extend to n - 1 with pr = n.
Using >= instead of > (or vice versa). For LC 84, both happen to produce the right answer, but cousin problems break — pick the variant that gives a unique boundary.
Storing heights instead of indices. You then can’t compute width.
Integer overflow in C++/Java for max-bar × max-N. Use 64-bit.
Recursive simulation instead of iterative — Python’s default recursion limit is 1000, breaks at N > 1000.
Java boxing in Stack<Integer> or ArrayDeque<Integer> — silent slowdown. Use int[] with manual top.

Debugging Strategy

Trace [2,1,5,6,2,3]. Stack evolution: push 0; at i=1 (height 1), pop 0 (height 2, left=-1, width=1, area=2); push 1; push 2; push 3; at i=4 (height 2), pop 3 (height 6, left=2, width=1, area=6); pop 2 (height 5, left=1, width=2, area=10); push 4; push 5; sentinel pops everything. Best = 10. ✓
Trace [1,1,1,1]. With >, the stack just accumulates [0,1,2,3]; sentinel pops 3 (area 1), pops 2 (area 2), pops 1 (area 3), pops 0 (area 4). Best = 4. ✓
Cross-check 50 random inputs of size 50 against brute force.

Mastery Criteria

Recognized “largest rectangle in histogram” as a monotonic-stack problem within 60 seconds.
Stated the strict-monotonic-stack invariant before coding.
Used the sentinel-0 trick on first attempt (or correctly drained post-loop).
Wrote i - left - 1 correctly, no off-by-one.
Generalized to LC 85 (max rectangle of 1s) without prompting.
Articulated the amortized O(N) bound (each index pushed and popped once).
Solved a cousin problem (LC 42 trapping rain water with stack, or LC 901 stock span) in <10 minutes within a week.

LeetCode Interview — Extreme Coding