Lab 06 — Advanced DP Optimization

Goal

Apply three classical DP optimization techniques — Convex Hull Trick (CHT), Knuth’s optimization, and Divide & Conquer DP — to reduce a polynomial-time DP from O(N²) or O(N³) to O(N log N) or O(N²).

Background

Many DPs have the form dp[i] = min_j (dp[j] + cost(j, i)) for various cost functions. The naive scan is O(N) per i, giving O(N²) total. Three techniques exploit structure in cost:

Convex Hull Trick: when cost(j, i) = a[j] * x[i] + b[j] (linear in x[i]), the transitions form lines; the min is the lower envelope, queryable in O(log N) or O(1) per query.
Knuth’s optimization: when the cost satisfies the quadrangle inequality (cost(a,c) + cost(b,d) ≤ cost(a,d) + cost(b,c) for a ≤ b ≤ c ≤ d), and optimal split points are monotonic. Reduces O(N³) to O(N²).
Divide & Conquer DP: when optimal split points are monotonic (opt(i, j) ≤ opt(i+1, j)). Reduces O(KN²) to O(KN log N).

Interview Context

Codeforces / ICPC: regular
Quant: heavy presence in trading-cost optimization, risk allocation
Compiler: loop scheduling sometimes uses CHT
Database: query optimizer cost minimization (rare)

Almost never in standard interviews.

When to Skip This Topic

Skip if any of these are true:

You aren’t already fluent in 1D and 2D DP (Phase 5 prerequisites)
You’re not targeting ICPC / quant / compiler optimization roles
You don’t have 2+ weeks to drill multiple variants

These are families of techniques; each requires several practice problems to internalize.

Problem Statement

Three variants, one for each technique:

Variant A — CHT (Convex Hull Trick)

You drive along a road with N houses. House i is at position x[i] (sorted). You can rent a car at house i for cost c[i] + d[i] * (x[j] - x[i]) if you drive from house i to house j > i. Starting at house 1, what’s the minimum cost to reach house N?

dp[j] = min over i < j of (dp[i] + c[i] + d[i] * (x[j] - x[i]))
      = min over i < j of (d[i] * x[j] + (dp[i] + c[i] - d[i] * x[i]))

This is linear in x[j] — CHT applicable. O(N) or O(N log N) depending on whether x is sorted.

Variant B — Knuth’s Optimization

Optimal Binary Search Tree. Given keys with access probabilities, build a BST minimizing expected access cost.

dp[i][j] = min over i ≤ k ≤ j of (dp[i][k-1] + dp[k+1][j]) + sum(p[i..j])

Naive O(N³). Knuth: O(N²) if opt[i][j] is monotonic, which holds when cost is quadrangle-inequality compliant.

Variant C — Divide & Conquer DP

Minimum K Partitions. Partition array a[1..N] into exactly K contiguous segments, minimizing sum of “cost” of each segment, where cost(l, r) satisfies the monotonic-opt property.

dp[k][i] = min over j < i of (dp[k-1][j] + cost(j+1, i))

Naive O(KN²). D&C DP: O(KN log N).

Constraints

A: 1 ≤ N ≤ 10^5
B: 1 ≤ N ≤ 5×10^3
C: 1 ≤ N ≤ 5×10^3, 1 ≤ K ≤ N

Clarifying Questions

A: Are x[i] strictly sorted? Are d[i] non-negative? B: Are probabilities normalized? Distinct keys? C: Is cost precomputable in O(1) after O(N²) prep? Quadrangle inequality verified?

Examples

A (CHT)

positions: [0, 5, 10, 20]
c = [0, 3, 2, _]; d = [1, 1, 2, _]
dp[1] = 0
dp[2] = 0 + 0 + 1*(5-0) = 5
dp[3] = min(0+0+1*10, 5+3+1*5) = 10
dp[4] = min(0+0+1*20, 5+3+1*15, 10+2+2*10) = 20 vs 23 vs 32 → 20

B (Knuth) — verify on a small probability vector.

C (D&C DP) — verify on contrived `cost`.

Brute Force

A: O(N²) DP scan. B: O(N³) standard. C: O(KN²) standard.

Brute Force Complexity

For N = 10^5 in A: O(N²) = 10^10 — TLE. For N = 5×10^3 in B/C: O(N³) = 1.25×10^11 — TLE.

Optimization Path

A (CHT):

Maintain a lower convex hull of lines y = d[i] * x + (dp[i] + c[i] - d[i] * x[i]). For each query x = x[j], find the line with minimum y at that x.

If x[j] is monotonic: use a “Li Chao tree” or a stack-based hull with pointer for O(1) amortized per query → O(N) total.
If x[j] arbitrary: binary search on hull → O(N log N).

B (Knuth):

Compute dp[i][j] for increasing j - i. For each (i, j), only try splits in [opt[i][j-1], opt[i+1][j]]. Amortized O(N²) instead of O(N³).

C (D&C DP):

For layer k, define solve(lo, hi, opt_lo, opt_hi): compute dp[k][lo..hi], knowing optimal split for each is in [opt_lo, opt_hi]. Recurse on midpoint m, then on solve(lo, m-1, opt_lo, opt[m]) and solve(m+1, hi, opt[m], opt_hi). Each level of recursion is O(N) work; depth is O(log N) → O(N log N) per k, O(KN log N) total.

Final Expected Approach

(See solution sketches inline in the variants above. Full implementations are 80–150 lines each in C++.)

Data Structures

A (CHT): deque of lines + intersection-checking helper
B (Knuth): dp[N][N], opt[N][N]
C (D&C): dp[K][N], recursive solver

Correctness Argument

CHT: A line y = mx + b is “dominated” if another line is below it at every x in the query range. The lower envelope contains exactly the non-dominated lines, in order of increasing slope.
Knuth: the quadrangle inequality implies monotonicity of opt. Proof in TAOCP Vol 3.
D&C DP: if opt[i] ≤ opt[i+1] (monotonicity), splitting the range and using this constraint reduces work logarithmically.

Complexity

Variant	Naive	Optimized
A	O(N²)	O(N) or O(N log N)
B	O(N³)	O(N²)
C	O(KN²)	O(KN log N)

Implementation Requirements

CHT: handle slopes carefully (sorted vs not); avoid division for intersections (use cross-multiplication with care for overflow)
Knuth: process diagonals in order of length; verify monotonicity in a debug build
D&C DP: pass index ranges + opt ranges; base case is lo > hi

Tests

Small N where brute force confirms answer
Edge: N = 1 (trivial answer)
All zero costs / probabilities
Monotonically increasing / decreasing costs
Stress: random instances at N = 1000 — compare optimized vs brute force

Follow-up Questions

CHT for max instead of min. → Maintain upper convex hull; symmetric.
Lines added in arbitrary slope order. → Use Li Chao Tree; O(log N) per insert and query.
Knuth not applicable (no quadrangle). → Either D&C DP (if opt is monotonic) or SMAWK (O(N) for totally monotone matrices).
D&C DP combined with CHT. → Possible; “Aliens trick” / Lagrangian relaxation.

Product Extension

CHT: dynamic programming in online ad bidding optimization, trading strategy optimization
Knuth: BST construction in language tools (rarely; usually use B-trees or hash maps)
D&C DP: optimal segmentation in time-series anomaly detection, network topology design

Language/Runtime Follow-ups

C++: all three implementable with stdlib. CHT often uses __int128 to avoid overflow in intersection checks.
Python: D&C DP works but with significant constant factor; CHT with Li Chao is feasible.
Java: BigInteger for safety on overflow-prone intersection checks.

Common Bugs

CHT: integer overflow in line intersection. Use long doubles or 128-bit.
CHT: deque pops the wrong end when slopes are descending vs ascending.
Knuth: didn’t verify the quadrangle inequality. Algorithm gives wrong answer silently.
Knuth: opt-range boundaries inclusive vs exclusive — off-by-one.
D&C: passed wrong opt range to recursive calls. Loses the monotonicity benefit.
D&C: base case (lo > hi) doesn’t return.

Debugging Strategy

For each technique, write a brute-force version side by side
Stress test with random small instances and assert equality
For CHT, print the hull after each insertion
For Knuth/D&C, log the chosen split points and verify monotonicity

Mastery Criteria

Recognize when a DP qualifies for each optimization
Implement CHT in ≤ 40 min
Implement Knuth in ≤ 30 min
Implement D&C DP in ≤ 30 min
State the prerequisite condition for each (linearity, quadrangle, monotonic-opt)
Estimate runtime for given N, K

LeetCode Interview — Extreme Coding