Lab 03 — Task Scheduler With Cooldown

Goal

Master the frequency-greedy pattern: schedule a stream of tasks with a per-type cooldown, minimizing total CPU cycles. Derive the closed-form formula (maxFreq - 1) * (n + 1) + countMax, prove its optimality via an exchange argument, and recognize when the formula breaks (when actual task count exceeds the formula).

Background

LC 621 is the canonical “greedy + counting formula” problem. The exchange argument is short but subtle: the most-frequent task type must be scheduled in the densest possible pattern (every n+1 slots), and any optimal schedule that does not schedule the most-frequent type maximally densely can be modified to do so without increasing the total cycle count. This produces the formula. The skill being tested is recognition of the pattern, derivation of the formula from first principles, and the discipline to also show the alternative max-heap simulation that handles the same problem operationally.

Interview Context

Asked at Amazon, Google, Meta. The interviewer is testing whether you can derive a formula or whether you reach for a heap reflexively. Both solutions are accepted; the formula version with proof is the stronger signal. Watch for the follow-up: “What if a new task type can arrive mid-execution?” — that breaks the formula and forces the heap.

Problem Statement

Given an array of tasks (uppercase letters) and an integer n (cooldown), schedule the tasks so that the same task type is separated by at least n other slots (the slots can be idle if necessary). Return the minimum number of CPU cycles to finish all tasks.

LeetCode reference: LC 621 — Task Scheduler.

Constraints

  • 1 ≤ |tasks| ≤ 10^4
  • tasks[i] is uppercase English letter (so at most 26 distinct types).
  • 0 ≤ n ≤ 100.

Clarifying Questions

  1. “Does the schedule have to be returned, or just the cycle count?” — count only.
  2. “Can multiple tasks of different types execute in the same cycle?” — no, one task per cycle (or one idle).
  3. “If n == 0, can same-type tasks run back to back?” — yes; answer is just len(tasks).
  4. “Are tasks pre-sorted or arrive in any order?” — order doesn’t matter; only the frequency vector matters.
  5. “Can a single task type appear more times than total cycles?” — no; the formula handles this naturally.

Examples

  • tasks = ["A","A","A","B","B","B"], n = 2 → 8. Schedule: A B _ A B _ A B.
  • tasks = ["A","A","A","B","B","B"], n = 0 → 6.
  • tasks = ["A","A","A","A","A","A","B","C","D","E","F","G"], n = 2 → 16. Formula: (6-1)*(2+1) + 1 = 16.
  • tasks = ["A","B","C","D","E","A","B","C","D","E"], n = 4 → 10. Formula gives (2-1)*5 + 5 = 10; actual = 10. Tight.
  • tasks = ["A","B","C","D","E","F"], n = 100 → 6. Formula gives (1-1)*101 + 6 = 6 since maxFreq = 1.

Initial Brute Force

Simulate. At each cycle, pick any task type with cooldown elapsed and remaining count > 0; if multiple, pick whichever (the simulation is correct under any tiebreaker, but optimal requires the highest-frequency one). If none available, idle. Repeat until all tasks done.

Brute Force Complexity

O(T) where T is the answer. Tight bound T ≤ |tasks| * (n + 1), so O(|tasks| * n) worst case. Acceptable but slower than the formula.

Optimization Path

  1. Brute simulation — easy to write, slow.
  2. Max-heap simulation — at each cycle, pop highest-count types, decrement, push back to a temporary “cooling” queue with cooldown timestamp. After n + 1 cycles or a complete pass, restore from cooling queue.
  3. Closed-form formula — derive from the structure of an optimal schedule. O(|tasks|) time, O(1) space.

Final Expected Approach

from collections import Counter

def least_interval(tasks, n):
    cnt = Counter(tasks)
    max_freq = max(cnt.values())
    count_max = sum(1 for v in cnt.values() if v == max_freq)
    return max(len(tasks), (max_freq - 1) * (n + 1) + count_max)

The max(len(tasks), …) handles the case where the formula gives less than total tasks — i.e., when there are so many distinct task types that we never need to idle.

Data Structures Used

  • A frequency counter (26-entry array or Counter).
  • Two integers: max_freq, count_max.

Correctness Argument

We prove the formula T = max(|tasks|, (max_freq - 1)(n + 1) + count_max) is optimal.

Setup. Let M = max_freq and K = count_max (number of types tied for the maximum frequency).

Lower bound (no schedule can do better than T). Consider any task type with frequency M. The M instances of this type must be separated by at least n other slots, so the schedule spans at least (M - 1)(n + 1) + 1 cycles for one such type. If K types are all tied at frequency M, then in cycles 1, n+2, 2n+3, … we must place an instance of each tied type — actually, we must place all K of them in one of the n+1-slot windows. The last window (after the last instance of the most-frequent type’s predecessor) has only the final instances of each tied type, contributing exactly K cycles after (M - 1)(n + 1). So T ≥ (M - 1)(n + 1) + K.

Also trivially T ≥ |tasks| (every task runs in its own cycle).

So T ≥ max(|tasks|, (M - 1)(n + 1) + K).

Upper bound (the formula’s value is achievable). We construct a schedule of length exactly (M - 1)(n + 1) + K (or |tasks| if larger) and show it is feasible.

  • Case 1: (M - 1)(n + 1) + K ≥ |tasks|. Lay out M − 1 complete frames of n + 1 slots each, followed by a final frame of K slots. In each frame, slot j (for 0 ≤ j < K) is reserved for the j-th most-frequent task type. The remaining n + 1 − K slots in each frame are filled by other task types in any order; if there aren’t enough non-cooldown candidates, idle. Exchange argument step: suppose an optimal schedule does not place the most-frequent task at slots 0, n+1, 2(n+1), … of consecutive frames. Then there is a slot i where it could have been placed but wasn’t. Swap it with whatever is there; cooldown is preserved (we are moving an instance to a slot that’s n + 1 away from the previous instance, which is within bounds); other tasks are not constrained by this swap. Iterate. The result is the formula schedule, with the same length.

  • Case 2: (M - 1)(n + 1) + K < |tasks|. Then there are more total tasks than the formula’s slot count; we have so much variety that no idle is needed. Schedule any feasible permutation; total cycles = |tasks|. This is achievable because at each cycle we have at least 26 distinct types to choose from (modulo cooldowns), and the cooldown constraint cannot exceed n slots, which is dominated by the diversity.

In both cases, the constructed schedule’s length matches the lower bound. Optimal. QED.

The exchange argument is the crucial step: it converts “the formula is one possible schedule’s length” into “no schedule is shorter.” Without the exchange, you have only an existence claim.

Complexity

  • Time O(|tasks|) — single pass to compute frequencies.
  • Space O(1) — at most 26 distinct task types, frequency dictionary is constant-bounded.

Implementation Requirements

  • Use Counter or a 26-entry array.
  • Compute max_freq and count_max in one pass.
  • Return max(len(tasks), (max_freq - 1) * (n + 1) + count_max) — do not skip the max.

Tests

def test_least_interval():
    assert least_interval(["A","A","A","B","B","B"], 2) == 8
    assert least_interval(["A","A","A","B","B","B"], 0) == 6
    assert least_interval(["A","A","A","A","A","A","B","C","D","E","F","G"], 2) == 16
    assert least_interval(["A","B","C","D","E","A","B","C","D","E"], 4) == 10
    assert least_interval(["A","B","C","D","E","F"], 100) == 6
    assert least_interval(["A"], 5) == 1
    assert least_interval(["A","A"], 0) == 2
    assert least_interval(["A","A","A","A"], 3) == 13  # (4-1)*4 + 1 = 13

Follow-up Questions

  • Streaming tasks. New tasks arrive mid-execution; formula no longer applies because frequencies change. Use the max-heap simulation.
  • Different cooldowns per task type. Heap with per-type cooldown tracker.
  • Print the actual schedule. Heap simulation produces a schedule; the formula does not directly give one (you’d reconstruct from the proof).
  • What if n can be huge (n = 10^9)? Same formula; constant-time arithmetic.

Product Extension

OS task scheduling with cooldown (e.g., a process that touched a hot resource must wait n cycles before re-touching). API rate limiting at the user level. Workout-program scheduling with muscle-group recovery windows.

Language / Runtime Follow-ups

  • Python: Counter(tasks) is O(|tasks|).
  • Java: int[26] array indexed by c - 'A'. Faster than HashMap<Character, Integer> for the constant-alphabet case.
  • Go: [26]int works the same.
  • C++: std::array<int, 26>; std::max_element for max_freq.
  • JS/TS: Map<string, number> or 26-entry array; the latter is faster.

Common Bugs

  • Forgetting max(len(tasks), formula) — fails on inputs with many distinct task types.
  • Using count_max = 0 when there’s only one max-frequency type (should be 1).
  • Off-by-one in the formula: (M - 1) * (n + 1) not M * (n + 1).
  • Heap simulation: forgetting to push the cooled-down task back, or pushing it back at the wrong cycle.

Debugging Strategy

  • For each test case, hand-compute M, K, and the formula. If formula matches expected, your code is wrong somewhere mechanical.
  • If formula doesn’t match expected, you have a conceptual error: either M is wrong (multi-counting) or K is wrong (counting non-tied types) or the max(|tasks|, …) clamp is missing.
  • Run brute simulation as a stress oracle for |tasks| ≤ 20.

Mastery Criteria

  • You can derive the formula from first principles in <3 minutes.
  • You can deliver the exchange argument out loud in <2 minutes.
  • You can write the formula-based solution in <3 minutes.
  • You can write the heap-based simulation in <10 minutes when asked for the streaming variant.
  • You can articulate why the max(|tasks|, formula) clamp is necessary and which case it covers, in <60 seconds.

← Lab 02 — Jump Game II · Phase 6 README · Lab 04 — Gas Station →