Lab 01 — Interval Scheduling (Activity Selection)
Goal
Master the canonical greedy problem: maximum non-overlapping interval selection. Internalize the earliest-end-time-first greedy and produce its exchange argument out loud, without help, in under 90 seconds.
Background
Interval scheduling is the prototype greedy problem in every algorithms textbook because it has the cleanest exchange argument and the largest gap between “intuitive but wrong” choices (earliest start, shortest duration, fewest conflicts) and the one correct one (earliest end). Mastering this lab is mastering the discipline of proof-before-code for the rest of Phase 6. The same exchange-argument template recurs in LC 452 (minimum arrows to burst balloons), LC 1353 (maximum events attended), and dozens of variants.
Interview Context
A staff-level interviewer at FAANG-tier companies will not accept “I’ll sort by end time” without a justification. The signal they’re testing for is: can this candidate distinguish between intuition and proof? The exchange argument, delivered cleanly in 60–90 seconds before code, is the strongest possible signal. Conversely, candidates who code first and then mumble “earliest end is correct because… well… it just is” almost always fail the question even when their code passes the tests.
Problem Statement
Given n activities, each with a start time s_i and an end time e_i, select the maximum-cardinality subset of activities that are pairwise non-overlapping (an activity ending at time t does not overlap with one starting at time t — boundary touching is allowed). Return the size of that subset, or the subset itself if requested.
LeetCode reference: LC 435 — Non-overlapping Intervals (return the count of intervals to remove to make the rest non-overlapping; equivalent to n − maxNonOverlapping).
Constraints
1 ≤ n ≤ 10^5−5 · 10^4 ≤ s_i < e_i ≤ 5 · 10^4- Boundary contact (an interval ending at
tand another starting att) does NOT count as overlap. - Time complexity must be
O(n log n); spaceO(1)extra (sort in place).
Clarifying Questions
- “Are intervals open, closed, or half-open?” — LC convention is half-open:
[s, e)is a common interpretation; ask the interviewer. - “Is an interval where
s == e(zero-duration) allowed?” — usually yes, treat as a single-point interval. - “Should I return the maximum non-overlapping count or the minimum to remove?” — LC 435 asks the latter; the answer is
n − maxNonOverlapping. - “Can the intervals be unsorted? Are they ever pre-sorted?” — assume unsorted unless stated; sort is part of the algorithm.
- “Are there ties on end time?” — yes; tie-break by start time ascending so the algorithm is deterministic and the proof handles ties cleanly.
Examples
[[1,2], [2,3], [3,4], [1,3]]→ max non-overlapping = 3 (pick[1,2],[2,3],[3,4]); remove count = 1.[[1,2], [1,2], [1,2]]→ max non-overlapping = 1; remove count = 2.[[1,2], [2,3]]→ max non-overlapping = 2; remove count = 0 (boundary contact does not overlap).[]→ 0.
Initial Brute Force
Try every subset; for each, check pairwise non-overlap; return the size of the largest valid subset.
from itertools import combinations
def max_non_overlap_brute(intervals):
n = len(intervals)
best = 0
for r in range(n + 1):
for subset in combinations(intervals, r):
ok = all(subset[i][1] <= subset[j][0]
for i in range(len(subset))
for j in range(len(subset))
if subset[i][0] < subset[j][0])
if ok:
best = max(best, len(subset))
return best
Brute Force Complexity
Time O(2^n · n^2) — every subset checked pairwise. Space O(n) for combinations. Infeasible at n = 25. Useful only as a stress-test oracle for the greedy on n ≤ 12.
Optimization Path
- Brute force (above) — establishes correctness baseline.
- DP — sort by end time;
dp[i]= max non-overlapping using intervals0..i. Transition:dp[i] = max(dp[i-1], 1 + dp[predecessor(i)])wherepredecessor(i)is the latestj < iwithe_j ≤ s_i. TimeO(n log n)with binary search. SpaceO(n). This is weighted interval scheduling’s structure when we drop weights. - Greedy — sort by end time, scan once, pick whenever compatible.
O(n log n)time,O(1)extra space. Optimality from the exchange argument below.
The DP step is worth deriving even though greedy beats it, because the moment intervals carry weights, greedy fails and DP is the only correct approach. Keep DP in your back pocket.
Final Expected Approach
Sort intervals by end time ascending (tie-break by start ascending). Maintain last_end = -∞. For each interval [s, e] in sorted order, if s ≥ last_end, accept it (count += 1, last_end = e); else skip. Return count.
For LC 435 the answer is n − count.
Data Structures Used
- A sorted list / array of intervals.
- One scalar
last_end. - One counter
count.
No heap, no DSU, no DP table. The algorithm is sort-and-scan.
Correctness Argument
This is the section the rest of Phase 6 hangs on. We use the canonical 4-step exchange argument.
Setup. Let G = g_1, g_2, …, g_k be the greedy solution: intervals sorted by end time, picked greedily. Let O = o_1, o_2, …, o_m be any optimal solution, also written in sorted-by-end-time order (we can always re-sort an optimal solution; non-overlap is preserved). Assume for contradiction m > k (O is strictly better than G); we will derive a contradiction by showing we can transform O into G step by step without decreasing its size — implying m ≤ k.
Step 1 — Locate the first divergence. Let i be the smallest index where g_i ≠ o_i. By construction, g_1 = o_1, …, g_{i-1} = o_{i-1}.
Step 2 — Compare end times at the divergence. Greedy picks the interval with the earliest end time among those compatible with g_1, …, g_{i-1} — equivalently, those compatible with o_1, …, o_{i-1}. So g_i.end ≤ o_i.end (with equality possible if there is a tie and o_i happens to be the tied alternative).
Step 3 — Exchange. Replace o_i with g_i in O, producing O' = o_1, …, o_{i-1}, g_i, o_{i+1}, …, o_m. We must verify (a) feasibility and (b) size.
- Feasibility.
g_iis compatible witho_{i-1} = g_{i-1}because greedy ensured this.g_iis compatible witho_{i+1}:o_{i+1}.start ≥ o_i.end ≥ g_i.end, sog_iends no later thano_idid, and the rest ofOwas already non-overlapping witho_i’s end. SoO'is feasible. - Size.
|O'| = |O|(we replaced one interval with one interval).
Step 4 — Iterate. O' agrees with G on positions 1..i. Repeat the argument on O' versus G to find the next divergence. After at most min(k, m) exchanges, the resulting solution agrees with G on the first min(k, m) positions.
Conclude. If m > k, after k exchanges the transformed O looks like g_1, …, g_k, o_{k+1}, …, o_m. But greedy stopped at g_k, which means there was no interval compatible with g_1, …, g_k. Therefore o_{k+1} cannot exist — contradiction. So m ≤ k, i.e., G is at least as large as any feasible solution. G is optimal. QED.
Complexity
- Time
O(n log n)for the sort,O(n)for the scan; totalO(n log n). - Space
O(1)extra beyond the input (sort in place);O(log n)for sort recursion in some implementations.
Implementation Requirements
- Sort key:
(end, start)— explicit tie-break. - Boundary semantics:
s_next ≥ e_previs “compatible” (touching allowed). The interview’s clarifying question about open/closed determines this; default to half-open. - Handle empty input (
n = 0→ return 0). - Single pass, no nested loop.
Tests
def test_interval_scheduling():
# canonical
assert max_non_overlap([[1,2],[2,3],[3,4],[1,3]]) == 3
# all overlapping
assert max_non_overlap([[1,2],[1,2],[1,2]]) == 1
# no overlap
assert max_non_overlap([[1,2],[2,3]]) == 2
# empty
assert max_non_overlap([]) == 0
# single
assert max_non_overlap([[5,10]]) == 1
# tie on end time
assert max_non_overlap([[1,3],[2,3],[3,4]]) == 2 # one of {[1,3],[2,3]} + [3,4]
# negative coords
assert max_non_overlap([[-5,0],[0,5],[5,10]]) == 3
# nested
assert max_non_overlap([[1,10],[2,3],[4,5],[6,7]]) == 3
Stress test: generate random n ≤ 12 and compare greedy to brute force on 1000 trials.
Follow-up Questions
- Weighted version (each interval has weight
w_i; maximize total weight of chosen non-overlapping). Greedy fails. Solution: DP with binary search predecessor pointer,O(n log n). - Online / streaming version (intervals arrive one at a time, decide accept/reject immediately, no recall). Different problem class — competitive ratio analysis.
- K-machine extension (
kparallel resources; each interval scheduled on any one of them; maximize total scheduled). Greedy withk“last_end” trackers + min-heap.
Product Extension
Calendar systems (Google Calendar’s “find a meeting time” feature, AWS spot-instance scheduling, ad slot allocation): the unweighted version is the prototype, and weighted variants are exactly what production schedulers solve.
Language / Runtime Follow-ups
- Python:
sorted(intervals, key=lambda x: (x[1], x[0]))— stable sort, tuple comparison handles tie-break for free. - Java:
Arrays.sort(intervals, (a, b) -> a[1] != b[1] ? Integer.compare(a[1], b[1]) : Integer.compare(a[0], b[0])). UseInteger.compareto avoid integer-overflow ona[1] - b[1]. - Go:
sort.Slice(intervals, func(i, j int) bool { if intervals[i][1] != intervals[j][1] { return intervals[i][1] < intervals[j][1] }; return intervals[i][0] < intervals[j][0] }). - C++:
std::sortwith a lambda; preferstd::tie(a[1], a[0]) < std::tie(b[1], b[0])for clarity. - JS/TS:
intervals.sort((a, b) => a[1] - b[1] || a[0] - b[0])— the|| 0falls through to start-comparison on tie.
Common Bugs
- Sorting by start time instead of end time (intuitive, wrong).
- Wrong tie-break (e.g., descending start) breaking determinism in the proof.
- Using
>instead of≥for compatibility, rejecting touching intervals. - Forgetting to update
last_endafter accepting an interval. - Returning the count when the question asked for the removal count (LC 435), or vice versa.
Debugging Strategy
If your greedy disagrees with brute force on some n ≤ 12 input:
- Print both solutions side by side.
- Find the first interval where they differ.
- Manually run the exchange argument step on that point — does the swap preserve feasibility? If not, your sort key or tie-break is wrong.
- If the exchange preserves feasibility but your code didn’t pick
g_i, your scan logic has an off-by-one in the compatibility check.
Mastery Criteria
- You can write the algorithm in <5 minutes from a blank screen, including the tie-break in the sort key.
- You can deliver the 4-step exchange argument out loud in <90 seconds, without notes.
- You can extend to LC 452 (minimum arrows to burst balloons) by recognizing it as the same problem with renaming, in <2 minutes.
- You can articulate why the weighted version requires DP, in <30 seconds.
- You correctly reject the wrong sort keys (earliest start, shortest duration, fewest conflicts) by giving a concrete counterexample for each, in <2 minutes total.