p45 — Reorganize String

Source: LeetCode 767 · Medium · Topics: Heap, Greedy, String, Counting Companies (2024–2025 frequency): Meta (high), Amazon (high), Google (medium-high), Microsoft (medium), Apple (medium), LinkedIn (medium) Loop position: Medium-Hard greedy + heap. Tests the “use top, stash, push back” delayed-reinsertion pattern that appears in task scheduling (LC 621), distance-K rearrangement (LC 1054), and load-balancing problems.

1. Quick Context

Given string s, rearrange its characters so no two adjacent characters are the same. Return any valid rearrangement, or "" if impossible.

Two canonical solutions:

  1. Greedy max-heap with delayed reinsertion. Pop the most-frequent char, append to result, STASH it (don’t push back yet); pop next most frequent, append, then push the stashed one back if still has count > 0. O(N log Σ).
  2. Counting-sort placement. Count frequencies; sort by count desc; place the most-frequent char at even indices first (0, 2, 4, ...), wrap to odd indices (1, 3, 5, ...). O(N + Σ log Σ). Often faster in practice and avoids the heap.

Feasibility check: answer exists iff max_count ≤ (N + 1) // 2. Otherwise return "" immediately.

The “stash, then push back” pattern is the key reusable trick — it prevents the same character from being chosen twice in a row, since the heap can’t see it until after the next pop.


🔗 https://leetcode.com/problems/reorganize-string/

STOP. 25-min timer. State the feasibility check first.


3. Prerequisite Concepts

  • Counter / collections.Counter.
  • heapq min-heap with negated counts for max-heap behavior.
  • Greedy intuition: always place the most constrained item first (the one with highest remaining count).
  • Pigeonhole-style feasibility argument.

4. How to Approach (FRAMEWORK Steps 1–9)

Step 1 — Restate: “Permute the characters of s so no two adjacent are equal; or report impossibility.”

Step 2 — Clarify:

  • “Lowercase ASCII only?” (LC: yes.)
  • “Length bound?” (LC: 1 ≤ N ≤ 500.)
  • “Any valid answer accepted, or canonical?” (Any valid.)
  • “Output empty string on impossibility?” (Yes per LC.)

Step 3 — Constraints: N ≤ 500 → any O(N log N) solution is comfortable. Σ = 26.

Step 4 — Examples:

  • "aab" → "aba".
  • "aaab" → "" (three as, length 4 → max_count=3 > (4+1)//2=2; impossible).
  • "aaabbc" → "ababac" (one valid output).

Step 5 — Brute force: Permute all N! arrangements; return first valid. Factorial; only viable for N ≤ 8. Useful as a small-input oracle.

Step 6 — Complexity:

  • Brute permutations: O(N! · N).
  • Heap greedy: O(N log Σ).
  • Counting-sort placement: O(N + Σ log Σ).

Step 7 — Pattern: “Greedy with delayed reinsertion.” Sibling problems: LC 621 (Task Scheduler with cooldown), LC 358 (Rearrange String K Distance Apart), LC 1054 (Distant Barcodes).

Step 8 — Develop: see solution.py.

Step 9 — Test: length 1, two distinct chars, max_count exceeds bound, all same char, alternating possible, tight feasibility (max_count == (n+1)//2).


5. Progressive Hints

If stuck for 5 min, open hints.md.


6. Deeper Insight — Two equally good answers

6.1 Feasibility — the pigeonhole argument

Claim: a valid rearrangement exists iff max_count ≤ (N + 1) // 2.

  • Necessity: Suppose char x has count c > (N+1)//2. Place x at positions p_1 < p_2 < ... < p_c. Between consecutive xs we need at least one non-x character. So we need c - 1 “separators”, which must come from the remaining N - c characters. Need N - c ≥ c - 1, i.e., c ≤ (N + 1) / 2.
  • Sufficiency: The counting-sort placement (Section 6.3) gives a constructive proof: place the most-frequent character at even indices 0, 2, 4, ...; it fills at most (N + 1) // 2 slots, which is exactly the budget; then place remaining chars at the rest. No collision because no character repeats at distance 1 unless max_count exceeds the bound.

Check this FIRST. Save heap work on impossible inputs.

6.2 Greedy max-heap with delayed reinsertion

cnt = Counter(s)
if max(cnt.values()) > (len(s) + 1) // 2: return ""

heap = [(-c, ch) for ch, c in cnt.items()]
heapq.heapify(heap)

result = []
prev_count, prev_char = 0, ''
while heap:
    c, ch = heapq.heappop(heap)   # c is negative
    result.append(ch)
    # Push the PREVIOUSLY-used char back now (it was stashed last iteration).
    if prev_count < 0:
        heapq.heappush(heap, (prev_count, prev_char))
    # Stash the just-used char for the NEXT iteration.
    prev_count, prev_char = c + 1, ch   # +1 because c is negative

return ''.join(result)

Why this works: By stashing (prev_count, prev_char) for one iteration, the character we just placed is invisible to the heap on the next pop — guaranteeing we pick a DIFFERENT character next. Once we’ve picked the different one, we put the stashed char back into the heap with its remaining count.

The greedy choice — always pop the highest-remaining-count character — is correct because:

  • If we ever delay placing the max-count char, we risk exhausting the “separators” before we run out of xs.
  • By contrast, always placing the max-count char (subject to the “not adjacent to previous” constraint) keeps the count balanced.

6.3 Counting-sort placement — no heap

n = len(s)
cnt = Counter(s)
if max(cnt.values()) > (n + 1) // 2: return ""

# Sort by (count desc, char asc) — sorting by count desc is what matters.
chars = sorted(cnt.keys(), key=lambda c: -cnt[c])

result = [''] * n
idx = 0
for ch in chars:
    for _ in range(cnt[ch]):
        result[idx] = ch
        idx += 2
        if idx >= n:
            idx = 1   # wrap to odd indices
return ''.join(result)

Why this works:

  • Even indices: 0, 2, 4, ... — total (n + 1) // 2 slots.
  • Place the highest-count char first; if max_count ≤ (n+1)//2, it fits entirely within even indices (or just barely overflows by 1 into odd indices — still fine).
  • After even indices fill, wrap to odd indices 1, 3, 5, ....
  • No collision: a character placed at even index i and at odd index i+1? Only happens if the SAME character wraps around — but that requires the character’s count exceeds (n+1)//2, which we’ve ruled out by the feasibility check.

This is often faster (no heap overhead) and simpler to explain.

6.4 Why “stash and re-push” is the canonical greedy

The bug-prone naïve version:

while heap:
    c, ch = pop()
    result.append(ch)
    if c + 1 < 0: push((c + 1, ch))

This puts the same character RIGHT BACK into the heap with one less count — so the next pop can pick it again if no other character has higher count. Result: adjacent duplicates → wrong.

The fix is the one-iteration delay (stash). Same trick appears in:

  • LC 621 Task Scheduler — multi-iteration delay equal to the cooldown.
  • LC 358 Rearrange String K Distance Apart — generalize to delay K-1.
  • LC 1054 Distant Barcodes — same problem, different framing.

6.5 When the heap loses

For N = 500 with Σ = 26: heap-of-26 operations are O(log 26) ≈ 4.7. Total ~500 · 4.7 ≈ 2350 heap ops. The counting-sort version does O(N) writes + O(Σ log Σ) sort. Both trivial.

For larger Σ (e.g., Unicode), counting-sort gets larger but heap stays O(log Σ). Cross-over depends on N and Σ; both are O(N log N) worst-case anyway. Mention both.

6.6 Connection to load balancing

This problem is a 1-D toy version of round-robin scheduling with non-uniform task arrival rates. Replace “characters” with “tasks per host”, “adjacent” with “consecutive on same core”. The “no two adjacent” constraint becomes “no two consecutive on same core” — same algorithm, same feasibility bound. Production load balancers (HAProxy weighted round-robin, Kubernetes pod placement) use related heuristics.


7. Anti-Pattern Analysis

  1. No feasibility check — heap runs forever (or returns junk).
  2. Push-back-immediately (no delay) — adjacent duplicates → wrong.
  3. Sort-by-char only ("aab""aab") — misses high-count constraint.
  4. Try permutations — factorial, TLE for N > 8.
  5. Use heapq as max-heap without negation — pops smallest first; greedy is broken.
  6. Greedy-pick second-highest when adjacent constraint applies without stash — same bug as (2).
  7. Counting-sort but place at consecutive indices (0, 1, 2, 3, ...) — only works if all distinct.
  8. Counting-sort but wrap to odd at wrong moment — off-by-one if idx == n vs idx > n.
  9. Returning result as list when problem wants string — small thing, easy to miss.
  10. Confusing “no two adjacent same” with “stable sort” — different goals.

8. Skills & Takeaways

  • Pigeonhole feasibility argument: max_count ≤ (N+1)//2.
  • Delayed reinsertion (stash) pattern in heap greedy.
  • Counting-sort placement alternative — even-indices-then-odd.
  • Greedy correctness: always handle the most-constrained element first.
  • Generalization: Task Scheduler (LC 621), K-distance (LC 358), Distant Barcodes (LC 1054).
  • Round-robin scheduling as the production analogue.

9. When to Move On

Move on when:

  • You write the feasibility check FIRST without prompting.
  • You implement heap-with-stash in ≤ 15 min.
  • You implement counting-sort placement in ≤ 15 min.
  • You can explain WHY the naïve push-back-immediately fails.
  • You can sketch LC 621 (cooldown delay) in 2 min.
  • Stress test on N ≤ 100 agrees with permutation brute force for small N + validates output for larger N.

10. Company Context

Meta:

  • Common Medium-Hard. L4/L5 expected; L6 asked to generalize to K-distance.
  • Bar Raiser scorecard: “Greedy — articulated feasibility argument; chose heap-with-stash or counting-sort; explained why naïve fails.”

Amazon:

  • L5/L6. Often paired with Task Scheduler as a two-problem set.

Google:

  • L4/L5. Often asks the counting-sort version specifically (no heap).

Microsoft:

  • L62/L63. Variant: print schedule respecting cooldowns.

Apple:

  • iOS Notifications team has asked the deduplication variant.

LinkedIn:

  • Common Medium; often paired with “now do K-distance” follow-up.

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff/Principal
FeasibilityMisses checkAdds check after seeing bugStates check FIRST+ proves it
GreedyNaïve push-backRecognizes adjacent-duplicate bugImplements stash+ explains generalization to LC 621/358
AlternativeHeap onlyHeap onlyMentions counting-sort+ compares both, knows when each is faster
Output orderingHardcodedNotes “any valid”Tests randomized+ discusses canonical/stable variants
ProductionSilentSilent“Like round-robin scheduling”+ load balancing, k8s pod scheduling, weighted scheduling

12. Level Delta

Mid (L4):

  • Implements heap-with-stash or counting-sort.
  • Includes feasibility check.
  • Tests with "aab", "aaab".

Senior (L5):

  • Provides BOTH solutions.
  • Explains naïve-push-back failure clearly.
  • Sketches LC 621 generalization (cooldown = stash-delay).

Staff (L6):

  • Proves feasibility via pigeonhole.
  • Discusses connection to LC 358 (K-distance) and LC 1054.
  • Discusses round-robin scheduling analogue.

Principal (L7+):

  • Discusses weighted scheduling in production (HAProxy, Envoy).
  • Discusses multi-resource scheduling (CPU + memory + GPU) — feasibility is harder; LP relaxation.
  • Discusses fairness constraints and starvation prevention in long-running systems.
  • Discusses online variant: characters arrive in a stream; produce output online without knowing total counts.

13. Follow-up Q&A

Q1: K-distance variant (LC 358)? A: Generalize stash to a QUEUE of size K-1: pop top, append, enqueue to wait-queue. After K iterations, dequeue and push back to heap (still has count > 0). Feasibility: max_count ≤ ceil(n / K) ? — actually (max_count - 1) * K + (#chars with max_count) ≤ n.

Q2: Task Scheduler (LC 621) — same but compute schedule LENGTH, not the schedule? A: Formula approach: result = max(n, (max_count - 1) * (cooldown + 1) + chars_at_max_count). O(N + Σ). No heap needed.

Q3: Online variant — characters arrive in a stream? A: Maintain heap; emit output as soon as legal. Buffer last-emitted for one tick (the stash). If stream completes and remaining chars violate adjacency, fail. Harder if counts unknown in advance.

Q4: Lex-smallest valid output? A: Greedy doesn’t directly give lex-smallest. Counting-sort version with lex tie-break on equal counts comes close but isn’t strictly minimal. True lex-smallest requires backtracking or careful ordering.

Q5: Why is max_count ≤ (n+1)//2 the bound (not n/2)? A: When n is odd, we have (n+1)/2 even slots (e.g., n=5: slots 0,2,4 = 3) and only n/2 odd slots. We can place the high-count char at all even slots without adjacency. n/2 would be too tight.

Q6: What if input has multiple characters tied at max_count, all > (n+1)//2? A: Impossible. The check max(counts) > (n+1)//2 catches it because the SINGLE max already exceeds.


14. Solution Walkthrough

See solution.py:

  • reorganize_string_heap — max-heap (negated) with stash pattern.
  • reorganize_string_indexed — counting-sort placement (even-then-odd).
  • reorganize_string_brute — permutation brute force (N ≤ 8 only).
  • is_valid(s, original) — checks no two adjacent same AND is a permutation of original.
  • edge_cases() + stress_test() — small inputs vs brute, large inputs validated by is_valid.

Run: python3 solution.py.


15. Beyond the Problem

Principal-engineer code review comment:

“Three thoughts. (1) Always do the feasibility check first — both as correctness and as a teaching artifact in the code; a senior should see that the math is the algorithm’s BACKBONE, not an afterthought. (2) For our scheduling code we use the counting-sort version because we want determinism (same input → same output for replay debugging); the heap version’s tie-break depends on heapq internals and isn’t stable. Prefer the deterministic one for production. (3) The K-distance generalization (LC 358) is the version that ACTUALLY shows up in our task scheduler — the cooldown is K. Recognize that this Medium is the warm-up for the Hard.”

Connections forward:

  • p41 — Kth Largest (heap intro).
  • p42 — Top K Frequent.
  • LC 621 — Task Scheduler (cooldown variant).
  • LC 358 — Rearrange String K Distance Apart (heap + wait queue).
  • LC 1054 — Distant Barcodes (same problem, ints).
  • LC 451 — Sort Characters by Frequency (counting-sort sibling).
  • Phase 02 Lab 09 — Heap Top-K.
  • Phase 06 — Greedy.
  • Production: round-robin scheduling, weighted load balancing, K8s pod scheduling, deduplication windows.