p47 — Insert Interval
Source: LeetCode 57 · Medium · Topics: Array Companies (2024–2025 frequency): Google (very high), Meta (high), Amazon (high), Microsoft (medium-high), Bloomberg (medium), LinkedIn (medium) Loop position: Common Medium, frequently paired with p46 Merge Intervals as a two-problem set. Tests whether you exploit pre-sorted input or wastefully re-sort.
1. Quick Context
You’re given a list of non-overlapping, sorted-by-start intervals and a new interval. Insert the new interval and merge if necessary; return the resulting list.
The lazy approach: Append new to the list, re-sort, run Merge Intervals (p46). O(n log n). Works but ignores the pre-sortedness — wasted log-factor.
The right approach: Three-phase linear scan in O(n):
- Before: copy all intervals that END strictly before
new.start. - Merge: while the current interval OVERLAPS new, expand new to
[min(starts), max(ends)]and skip; finally append the expanded new. - After: copy all remaining intervals.
out = []
i, n = 0, len(intervals)
# Phase 1: strictly before
while i < n and intervals[i][1] < new[0]:
out.append(intervals[i]); i += 1
# Phase 2: merge overlapping
while i < n and intervals[i][0] <= new[1]:
new = [min(new[0], intervals[i][0]), max(new[1], intervals[i][1])]
i += 1
out.append(new)
# Phase 3: strictly after
out.extend(intervals[i:])
return out
O(n) time, O(n) output. The senior signal is recognizing that re-sorting is wasteful.
2. LeetCode Link + Attempt Gate
🔗 https://leetcode.com/problems/insert-interval/
STOP. 15-min timer. Draw the three phases on paper before coding.
3. Prerequisite Concepts
- p46 Merge Intervals (the sub-procedure used by the lazy approach).
- Linear-scan iteration with index variable.
- Closed-interval overlap check
a.start ≤ b.end AND b.start ≤ a.end.
4. How to Approach (FRAMEWORK Steps 1–9)
Step 1 — Restate: “Given disjoint sorted intervals and one new interval, insert + merge to maintain disjoint-sorted invariant.”
Step 2 — Clarify:
- “Are existing intervals guaranteed sorted by start AND non-overlapping?” (LC: yes.)
- “Closed-interval merge?” (LC: yes —
[1, 2]and[2, 3]merge.) - “May new interval be empty (
start > end)?” (LC: no.) - “Can I mutate input?” (LC: yes.)
Step 3 — Constraints: n ≤ 10^4. Both O(n) and O(n log n) work. The interview signals which one you reach for.
Step 4 — Examples:
intervals = [[1,3],[6,9]], new = [2,5] → [[1,5],[6,9]].intervals = [[1,2],[3,5],[6,7],[8,10],[12,16]], new = [4,8] → [[1,2],[3,10],[12,16]].intervals = [], new = [5,7] → [[5,7]].intervals = [[1,5]], new = [2,3] → [[1,5]](contained).intervals = [[1,5]], new = [6,8] → [[1,5],[6,8]](after).
Step 5 — Brute force: Append + re-sort + run merge. O(n log n).
Step 6 — Complexity:
- Lazy (append + sort + merge): O(n log n).
- Three-phase scan: O(n).
- Binary search to find phase boundaries: O(log n) for boundaries + O(k) merge where k = #merged. Worst-case O(n) but typically faster.
Step 7 — Pattern: Pre-sorted-input optimization. Whenever input is sorted, ASK if you can leverage that to drop a log factor.
Step 8 — Develop: see solution.py.
Step 9 — Test: empty list; new before all; new after all; new merges with one; new merges with many; new contained; new contains all.
5. Progressive Hints
If stuck for 5 min, open hints.md.
6. Deeper Insight — The three phases, formalized
6.1 Phase definitions
Let new = [ns, ne]. Partition the input intervals into three groups:
- B (Before):
i.end < ns— strictly before new; cannot merge. - M (Merge):
i.start ≤ ne AND i.end ≥ ns— overlaps new; must merge. - A (After):
i.start > ne— strictly after new; cannot merge.
After sorting by start (input is already sorted), these groups appear in order: B, then M, then A. The scan walks through them in sequence.
Critical detail: the boundaries depend on the overlap check. For closed intervals: i.end < ns means “strictly before”; i.end ≥ ns AND i.start ≤ ne means “overlaps.”
6.2 The merge update
While in phase M, the new interval EXPANDS:
new = [min(new[0], i[0]), max(new[1], i[1])]
Both min and max are required:
minbecause the merged interval starts at the EARLIEST of all merged starts (could be the original new or some intervals[i]).maxbecause the merged interval ends at the LATEST of all merged ends.
After phase M terminates, append new (now expanded) ONCE, then continue with phase A.
6.3 Why one-pass O(n) beats the O(n log n) lazy approach
The lazy approach treats the problem as unsorted: append, re-sort, run merge. The re-sort is n log n work to recover what we already knew. The three-phase scan exploits the existing order: each interval is visited exactly once, and we know its phase from a simple comparison.
For n = 10^4, both are fast. For n = 10^7 (in-memory database with timestamp ranges, say), the log-factor matters: 10^7 vs ~2.3 × 10^8 ops.
6.4 Binary search optimization
The boundary between B and M can be found by binary-searching for the first interval with end ≥ ns. Similarly, the boundary between M and A. Each is O(log n).
If only a few intervals merge (typical case in production), this gives O(log n) total work — a huge win when n is large and updates are dense.
import bisect
# left boundary: first interval whose end >= ns
b_end = bisect.bisect_left([i[1] for i in intervals], ns)
# right boundary: first interval whose start > ne
a_start = bisect.bisect_right([i[0] for i in intervals], ne)
In practice, building those projection lists costs O(n), so use a SortedList-of-ends or maintain dual indices. Mention this as the senior+ optimization.
6.5 Mutation vs immutability
The three-phase scan can build out as a fresh list (clean) or mutate intervals (savvier). Mutation:
- Splice out the M intervals, replace with the single expanded new.
- O(n) due to list shifts in Python; not really faster, just lower memory.
Production: prefer immutable returns. Frees the caller from aliasing concerns and pairs with type-safe interval types.
6.6 Half-open variant
For [start, end) intervals: change < to <= in phase 1 boundary (i.end <= ns means “strictly before”), and <= to < in phase 2 (i.start < ne means “overlaps”). One-liner difference, but ASK first.
6.7 Online insertion into a sorted-interval data structure
If you maintain a long-lived sorted set of intervals and insert is frequent, use a SortedList (or interval tree):
- Find insertion point with bisect: O(log n).
- Scan left and right for overlapping neighbors; merge.
- Remove subsumed; insert expanded new. Amortized O(log n + k) per insert. This is the production answer; LC asks for the batch one-shot.
7. Anti-Pattern Analysis
- Append + sort + run Merge Intervals — works, but wastes the pre-sortedness; senior interviewers downgrade.
- Skip the
mininnew[0] = ...— if new started LATER than some merged interval, you’d shrink the merged interval’s start. Wrong. - Forget
max— same issue from p46. - Forget to append
newbefore phase A — drops the merged interval entirely. - Loop with
foroverintervalsandbreakinstead ofwhilewith index — phase A becomes awkward (need to remember where you stopped). - Treat overlap as
i.end <= nsstrict-less-than when problem is closed — misses the touch-merge case. - Re-check the same interval against
newafternewhas expanded — already handled by thewhileadvancingi; only a bug if you forget to advance. - Use
extend([new])onout—append(new)is cleaner. - Returning the input list with
intervals.insert(...)— O(n) per insert AND breaks the merge logic for multi-overlap. - Computing
newlazily then forgetting to push — the “phase 2 exit point” is the most common bug site.
8. Skills & Takeaways
- Three-phase scan as a pattern for “insert into sorted disjoint structure.”
- Pre-sorted-input optimization — drop log factor by exploiting order.
min(starts), max(ends)as the universal merge formula.- Closed vs half-open boundary semantics.
- Binary search to locate phase boundaries (O(log n) senior-level optimization).
- SortedList / interval tree as the production data structure for online inserts.
9. When to Move On
Move on when:
- You write the three-phase scan in ≤ 10 min from memory.
- You explain WHY both
minandmaxare needed in the merge update. - You ASK about endpoint semantics (closed vs half-open).
- You can sketch the binary-search optimization in 2 min.
- Stress test on 1000 random inputs agrees with the append+sort+merge oracle.
10. Company Context
Google:
- L4/L5 onsite Medium; usually paired with p46.
- Bar Raiser: “Exploited pre-sorted input for O(n); didn’t re-sort.”
Meta:
- L5+ phone screen. The three-phase scan must be on-the-fly correct.
Amazon:
- “Insert a new shift into a driver’s existing schedule” — same problem, productized phrasing.
Microsoft:
- L62+. Variant: “Now do it 1000 times — maintain the structure across many inserts.” (Online variant.)
Bloomberg:
- “Add a new trading-hour exception to an existing market hours list.”
LinkedIn:
- Calendar conflict detection.
11. Interviewer’s Lens
| Signal | Junior | Mid | Senior | Staff/Principal |
|---|---|---|---|---|
| First instinct | Append + sort + merge | Three-phase scan | + explains why pre-sortedness matters | + mentions binary-search boundaries |
| Merge update | Just max (BUG) | min AND max | + explains both | + immutable rebuild for safety |
| Edge: empty input | Crashes | Handles | + explicitly tests | + parameterizes for Optional |
| Edge: new before all / after all | Forgets phase A | Handles | + names the three phases | + symmetric reasoning |
| Production | Silent | Silent | “SortedList for online” | + interval tree, segment tree, persistent data structures |
12. Level Delta
Mid (L4):
- Three-phase scan working.
minandmaxin merge update.- Handles empty input.
Senior (L5):
- ASKs endpoint semantics.
- Articulates O(n) vs O(n log n) trade.
- Mentions binary-search boundaries as further optimization.
Staff (L6):
- Discusses SortedList / interval tree for online inserts.
- Discusses 2D extension (rectangle insert) — much harder.
- Discusses immutable persistent interval structures for transactional systems.
Principal (L7+):
- Discusses persistent / functional interval trees for time-travel queries.
- Discusses CRDT-style merge for distributed calendar systems.
- Discusses online + concurrent insert with lock-free data structures.
- Discusses bulk insert when many intervals are inserted at once: sort batch first, then sweep through the existing structure once.
13. Follow-up Q&A
Q1: What if many inserts in sequence — maintain state across calls? A: SortedList keyed by start. Per insert: bisect to find left neighbor; scan right while overlapping; remove + replace. Amortized O(log n + k) per insert, k = #merged.
Q2: Insert BULK (many new intervals at once)? A: Sort the new batch; do a merged sweep through (existing ∪ new) using two-pointer; output the union as a fresh merged list. O((n + m) log m) where m = batch size, or O(n + m) if both are pre-sorted.
Q3: Half-open intervals?
A: Switch the overlap check from <= to < at the phase boundaries.
Q4: Insert with a CAPACITY constraint (each interval covers up to k items)? A: Becomes a sweep-line accumulation problem. Harder; not a simple insert anymore.
Q5: Insert into a DISK-resident sorted interval store? A: B-tree or LSM-tree of intervals; bulk-merge strategy similar to Q2. Production: LevelDB/RocksDB with interval keys.
Q6: Delete an interval — symmetric problem? A: Find intervals overlapping the target via bisect; for each, either remove (if fully contained) or split into pieces (if it spans the boundary). O(log n + k).
14. Solution Walkthrough
See solution.py:
insert_interval— three-phase O(n) scan canonical answer.insert_interval_lazy— append + sort + merge oracle.insert_interval_binary— binary-search boundary optimization.edge_cases()+stress_test()— 1000 random inputs; all three agree.
Run: python3 solution.py.
15. Beyond the Problem
Principal-engineer code review comment:
“Three things. (1) The three-phase scan reads beautifully because the phases are NAMED in comments. Make the comments
# Phase 1: before,# Phase 2: merge,# Phase 3: after. Future-you and reviewers immediately know what each loop does. (2) For our calendar-conflict service, we use a SortedList of(start, end)tuples with online inserts; this batch problem is a unit test against a brute oracle, not production. Don’t conflate the two. (3) If the input intervals are immutable references (shared with other code), the lazy approach mutates them via re-sort. Prefer the three-phase scan precisely because it never sorts the input.”
Connections forward:
- p46 — Merge Intervals (subroutine for the lazy approach).
- p48 — Non-overlapping Intervals.
- p49 — Meeting Rooms II.
- p50 — Employee Free Time.
- LC 715 — Range Module (online interval add/remove/query).
- LC 731 — My Calendar II (double-booking detection).
- LC 732 — My Calendar III (max active overlap).
- Production: calendar systems, scheduling, CRDT-based collaborative editing of time ranges.