Phase 7 — Competitive Programming Acceleration

Target level: Hard → Codeforces Div 2 D (rating ~1900–2100) Expected duration: 2 months (12-month Elite track) / 4 weeks selective topics (6-month Serious track) / skipped or read-only (12-week Accelerated track) Weekly cadence: ~10 competitive topics + 6 labs + 2 contests/week + 30–60 problems applying them under the framework

A Direct Note On ROI Before You Spend Two Months Here

This phase has the lowest direct ROI per hour for FAANG SWE2 / L4 prep of any phase in this curriculum. If your goal is a Google L4, Meta E4, Amazon SDE2, or similar — you can skip this phase entirely and lose nothing. Phases 01 → 06 plus Phase 8 (practical engineering) cover essentially every problem you will see in those interviews. Modular inverse will not appear in your loop. Convex hull will not appear in your loop. Mo’s algorithm will not appear in your loop. The opportunity cost of two months on competitive programming is two months you could have spent on system design, behavioral prep, or sleep.

This phase has the highest direct ROI per hour for: HFT/quant interviews (Jane Street, HRT, Citadel, Two Sigma, Optiver, IMC, Jump), compiler/runtime/database internals teams (Google’s compiler infra, Microsoft’s CLR, Oracle’s HotSpot, ClickHouse, Snowflake’s query engine), distributed systems coding rounds at the senior+ level where contest-style problems are deliberately used as filters, ICPC-flavored test rounds at startups founded by ex-CP champions, and any interview where the explicit goal is to filter out everyone except the top ~5% of candidates by raw algorithmic horsepower. In those loops, the topics in this phase are not optional decoration — they are the test. A candidate who cannot derive a modular inverse, write binary exponentiation, or sweep events along a coordinate cannot pass an Optiver onsite no matter how good their system design is.

So: decide your target before you start this phase, and do not feel guilty about skipping it if the ROI calculation says skip. The rest of this README assumes you’ve decided to do the work.

What “Competitive Programming Acceleration” Actually Means

Competitive programming is not just “harder LeetCode”. It is a different sport with a different culture, different problem-solving rhythm, and different correctness bar. The differences that matter for interview prep:

Constraints are everything. A LeetCode Hard might say 1 ≤ N ≤ 10^5 and accept any O(N log N) solution. A Codeforces problem will say 1 ≤ N ≤ 5·10^5, T ≤ 10^4 testcases, sum of N ≤ 5·10^5, 2 second time limit, and your O(N log^2 N) solution will TLE while O(N log N) will pass with 200ms to spare. Reading constraints first — before the problem statement — is the single biggest skill jump from LeetCode to CP.
Problems are short. A typical CF Div 2 problem is 3–8 sentences plus 2 example testcases. Information density per word is 5–10× LeetCode. Skim-then-deep-read is wrong; deep-read on first pass is correct.
Brute force is a starting point, not an ending point. Submitting brute force on a contest problem to “lock in partial credit” is a LeetCode habit. On CF you submit only when you believe you have the intended complexity, because wrong submissions cost 50 points each (penalty time).
Stress testing is a normal part of the workflow. Top CP grandmasters run brute-vs-candidate stress tests against random inputs during a contest, on every problem they’re not 100% certain about. This is the muscle Lab 06 builds.
Editorials are a separate skill. After a contest, reading editorials productively (extracting transferable techniques, not just patching your specific solution) is half the learning. Most candidates read an editorial and take away nothing because they read it as a solution rather than as a textbook.

The competitive programming skill set translates to interview signal in three ways: (1) speed — you become physically faster at typing and debugging, which buys time for harder questions; (2) vocabulary — when an interviewer says “this is a sweep line problem” or “use binary search on the answer”, you have a direct reference rather than re-deriving from scratch; (3) pattern coverage — the long tail of “weird trick” problems that interviewers reach for to filter senior candidates is exactly the long tail of CP techniques.

What You Will Be Able To Do After This Phase

Read a Codeforces Div 4 / Div 3 problem in <2 minutes, decide brute-vs-intended in <1 minute, and submit Div 4 A–F or Div 3 A–E within contest time.
Reach Div 2 C consistently and attempt Div 2 D in ~50% of contests.
Read AtCoder Beginner Contest problems A–F and solve A–E reliably; reach F in ~50% of contests.
Reach AtCoder Regular Contest A–C, with C being the contest-finisher you usually upsolve afterward rather than solve in-contest.
Compute nCr mod p for p prime and n up to 10^7, with precomputed factorials and modular inverses, in <5 minutes from blank.
Implement binary exponentiation (a^b mod m) in <2 minutes and recognize when matrix exponentiation reduces a linear recurrence from O(N) to O(K^3 log N).
Implement the Sieve of Eratosthenes (basic and linear), the smallest-prime-factor sieve, and trial-division factorization, knowing when each is appropriate.
Implement modular inverse via Fermat’s Little Theorem (when modulus is prime) and via extended Euclidean (when it isn’t), and know which to reach for.
Implement Andrew’s monotone chain convex hull in <15 minutes and explain why cross product replaces division for orientation.
Implement a sweep line for the skyline problem and 1D rectangle union; recognize the “sort events, scan, maintain active set” pattern under disguise.
Implement coordinate compression as a one-line preprocessing step and combine it with Fenwick tree to count inversions in O(N log N).
Implement Mo’s algorithm with the canonical block-sqrt sorting comparator and explain its O((N + Q) √N) complexity.
Compute Sprague-Grundy numbers for impartial games and reduce composite games via XOR.
Write a stress-testing harness — brute, candidate, random generator, comparator — and use it to find a planted bug in <5 minutes.
Solve interactive CP problems (binary search a hidden value, query a hidden function) using line-buffered I/O and explicit flush discipline.
Configure fast I/O in your language of choice — cin/cout desync in C++, bufio.NewReader + bufio.NewWriter in Go, sys.stdin.readline + sys.stdout.write in Python, BufferedReader + PrintWriter in Java — without thinking about it.

How To Read This Phase

Read this README once, linearly, end-to-end. Do not try to memorize it. The 19 inline topic sections are reference material — internalized when you actually use them on contest problems, not by re-reading. The 9 progression sections are playbooks — they tell you which contests to enter and what the success bar is.

After the linear pass, do this in order:

Set up your CP toolchain — install your language compiler, configure fast I/O templates, get accounts on Codeforces and AtCoder.
Work Lab 01 through Lab 06 in order. The labs are designed so each one builds a primitive you reuse in the next.
Start the contest progression — Div 4 first, then Div 3, then Div 2. Do not skip Div 4 thinking it’s “too easy”; the goal there is speed, not difficulty.
After every contest, spend at least 2× the contest time on upsolving (problems you didn’t solve in-contest, with the editorial open). Upsolving is where the learning happens.

Each topic entry has a fixed shape:

Definition — what the technique is.
When Used — the problem signal that fires this technique.
Complexity — the canonical time/space.
Classic Problems — 2–4 representative LC / CF / AtCoder problems.
Pitfalls — the bugs that consume the most contest minutes for this technique.

The phase ends with a Mastery Checklist, Exit Criteria, and links to all six labs.

CP Problem-Solving Methodology — The Five-Step Loop

The single most teachable skill in competitive programming is the read → constraints → brute → submit → stress loop. Apply it to every problem.

Read fast. First read takes ~60 seconds. Goal: identify the problem class (graph? DP? math? sweep? game?) and the input/output format. Don’t try to solve yet. If you don’t understand on first read, re-read — but do not start sketching code.
Look at constraints before optimizing. This is the single biggest behavioral difference between CP and LeetCode habits. The constraint N ≤ 18 says bitmask DP. N ≤ 22 says meet-in-the-middle. N ≤ 5000 says O(N²). N ≤ 2·10^5 says O(N log N). N ≤ 10^9 says you don’t iterate N at all — math, binary search on the answer, or a closed form. The constraint is the algorithm choice. Read it first; do not write a single line of code without it.
Brute-force first, in your head. Even if brute force won’t pass, the brute force gives you (a) a correctness oracle for stress testing, (b) a starting point for optimization, (c) a 100% reliable answer to “do I understand the problem?”. If you can’t write the brute force, you don’t understand the problem yet — re-read the statement.
Submit early and often, but only when confident. Do not submit a partial / “maybe correct” solution to lock in points; CF/AtCoder penalize wrong submissions. If your code passes the sample inputs, that is necessary but not sufficient. Sample inputs are the easiest possible cases by construction; passing them is the floor, not the ceiling. Stress-test before submitting on any problem you’re <90% confident on.
Stress test if uncertain. Lab 06 builds this muscle. The pattern: brute (definitely correct, exponential), candidate (your fast solution), random generator (small inputs, N ≤ 10), comparator that runs both and dies on mismatch. Run it for 1000 random tests in 30 seconds. If it doesn’t fail in 1000 trials, it probably won’t fail on the judge.

The loop applies recursively. If you’re stuck in step 3 (can’t write brute force), drop to “what’s the absolute simplest version of this problem?” — usually a smaller N, a special case, or a related problem. Solve that first. That’s almost always how the intended solution is derived.

Inline Topic Reference

Math

1. Modular Arithmetic

Definition

Arithmetic over the residue ring Z/pZ (typically p = 10^9 + 7 or p = 998244353). Addition, subtraction, multiplication, and exponentiation are all defined modulo p. Division is not defined directly — see modular inverse.

When Used

Whenever the answer is “huge” — count of arrangements, count of paths, sum over all subsets — and the problem says output mod 10^9 + 7. This is the most common modifier on counting problems in CP.

Complexity

Addition / subtraction / multiplication are O(1). Watch for overflow: in C++ (a * b) % p overflows int when p ≈ 10^9; cast to long long first. In Java, % is signed (negative % of negative integer is negative); use ((a % p) + p) % p after subtraction. In Python, integers are arbitrary precision so overflow doesn’t happen, but performance suffers — keep numbers under p aggressively.

Classic Problems

CF 1342E (Placing Rooks) — counting arrangements mod 10^9 + 7.
AtCoder ABC 174 F — count distinct elements queries (off-topic but illustrates mod ergonomics).
LC 920 (Number of Music Playlists) — DP with mod.

Pitfalls

Forgetting to mod after every multiplication; the value silently overflows and silently corrupts answers.
Negative numbers after subtraction in C++/Java; always ((x % p) + p) % p.
Using % on double (always wrong; mod is integer-only).

See Lab 01 — Modular Arithmetic.

2. Modular Inverse

Definition

The modular inverse of a mod p is the unique x in [0, p) such that a · x ≡ 1 (mod p), when it exists. Existence requires gcd(a, p) = 1. When p is prime, every a ≠ 0 has an inverse.

Two computation methods:

Fermat’s Little Theorem (FLT): if p is prime, a^(p-1) ≡ 1 (mod p), so a^(p-2) ≡ a^(-1) (mod p). Use binary exponentiation in O(log p).
Extended Euclidean Algorithm (extgcd): find x, y such that a·x + p·y = gcd(a, p). If gcd = 1, then x mod p is the inverse. O(log min(a, p)).

When Used

Division by a mod p (replace n / a with n · inv(a)).
Computing nCr mod p from precomputed factorials: nCr = fact[n] · inv(fact[r]) · inv(fact[n-r]).
Probability problems where the answer is a fraction p/q modulo a prime; the answer is p · q^(-1) mod prime.

Complexity

O(log p) per inverse via either method. For batched inverses of n values, there’s a clever O(n) algorithm using the running product trick — useful when precomputing inverse factorials.

Classic Problems

CF 1462E2 (Close Tuples to Arrays, Hard) — nCr mod p heavy.
AtCoder ABC 178 F (Contrast) — combinatorics with mod.
CF 1342E — modular inverse for counting.

Pitfalls

Using FLT when p is composite — incorrect, must use extgcd.
Forgetting that inv(0) is undefined; guard before calling.
Using FLT when the modulus is prime but you accidentally pass p - 1 instead of p - 2.

3. Binary Exponentiation (Fast Power)

Definition

Compute a^b (or a^b mod m) in O(log b) time by exploiting the binary representation of b. The recurrence: a^b = (a^(b/2))^2 if b is even, a · a^(b-1) if b is odd.

long long power(long long a, long long b, long long m) {
    long long res = 1 % m;
    a %= m;
    while (b > 0) {
        if (b & 1) res = res * a % m;
        a = a * a % m;
        b >>= 1;
    }
    return res;
}

When Used

Anywhere you’d otherwise loop b times multiplying a. With b up to 10^18, naive looping is impossible; binary exponentiation is mandatory. Also the implementation engine for FLT-based modular inverse and matrix exponentiation.

Complexity

O(log b) multiplications. Each multiplication is O(1) for integers but O(K^3) for K×K matrices (giving O(K^3 log b) for matrix exponentiation).

Classic Problems

CF 630I, 630J — direct power computation.
LC 50 (Pow(x, n)) — the canonical binary exponentiation problem.
AtCoder ABC 178 D — DP with mod, uses fast power for inverses.

Pitfalls

Negative exponents (LC 50): handle as 1 / power(x, -n) and watch for INT_MIN (negating overflows).
Base case b = 0 returning 1, but 1 % m if m = 1 should be 0 — start with res = 1 % m.

See Lab 02 — Binary Exponentiation.

4. Matrix Exponentiation

Definition

For a linear recurrence f(n) = c_1 · f(n-1) + c_2 · f(n-2) + ... + c_k · f(n-k), the state vector [f(n), f(n-1), ..., f(n-k+1)] is obtained from the state vector at step n-1 by multiplying by a fixed k×k companion matrix M. Therefore the state at step n is M^n · initial_state, and M^n is computed by binary exponentiation in O(k^3 log n) time.

When Used

Linear recurrences where n is up to 10^18 and k (the recurrence depth) is small (typically k ≤ 60). The textbook example is Fibonacci modulo a prime for n = 10^18. Also: counting walks of length n in a graph (M = adjacency matrix), counting paths in a DFA, certain combinatorial DPs over a small fixed state space.

Complexity

O(k^3 log n) time, O(k^2) space. For k = 2 (Fibonacci), 8 log n multiplications mod p ≈ 500 ops for n = 10^18.

Classic Problems

Fibonacci mod p, n = 10^18 — the canonical introduction.
CF 392C (Yet Another Number Sequence) — matrix exponentiation with polynomial coefficients.
AtCoder DP Contest R (Walk) — counting walks of length K in a graph.

Pitfalls

Index off-by-one in the state vector (forgetting that the last entry is f(n-k+1), not f(n-k)).
Forgetting to mod every matrix multiplication entry.
Using nested Python lists instead of NumPy for matrices — Python is too slow for K ≈ 50 and log n ≈ 60.

See Lab 02 — Matrix Exponentiation for Fibonacci.

5. Sieve of Eratosthenes (and Linear Sieve)

Definition

Build a boolean array is_prime[0..N] in O(N log log N) time by, for each prime p ≤ √N, marking all multiples of p (starting from p²) as composite. The linear sieve variant produces the smallest prime factor (SPF) for every integer up to N in exactly O(N) time using the invariant “every composite is sieved once, by its smallest prime factor”.

When Used

Counting primes up to N for N ≤ 10^7 (Sieve of Eratosthenes is faster than trial division).
Generating all primes up to N for prime-related problems.
Building a smallest-prime-factor table for fast factorization (see Topic 6).
Euler’s totient phi(n) for all n ≤ N in O(N log log N).

Complexity

Sieve of Eratosthenes: O(N log log N), space O(N) (or N/8 with a bitset). Linear sieve: O(N), space O(N) for the SPF table.

Classic Problems

LC 204 (Count Primes) — sieve introduction.
CF 17A (Noldbach problem) — primes near pairs of primes.
Project Euler 10 (sum of primes below 2M) — sieve of size 2·10^6.

Pitfalls

Iterating to N instead of √N in the outer loop (correctness OK but O(N²)-flavor slow).
Starting the inner loop at 2p instead of p² (correct but slower; p², p²+p, p²+2p, ... is the optimal start).
Using vector<bool> in C++ is fine; bool[] is also fine. unordered_set<int> is not fine — too slow.

See Lab 03 — Sieve and Factorization.

6. Prime Factorization

Definition

Decompose n into its prime factors. Two main techniques:

Trial division. For each p = 2, 3, 5, ..., √n, while p | n, divide. Final n > 1 is itself prime. O(√n) per number.
Smallest-prime-factor (SPF) sieve. Precompute spf[i] = smallest prime dividing i, for all i ≤ N. Then factor any i ≤ N in O(log i) by repeatedly replacing i with i / spf[i]. O(N log log N) preprocessing; O(log i) per query.

When Used

Trial division when factoring a single large n (up to 10^14 is feasible).
SPF sieve when factoring many numbers in a range [1, N] for N ≤ 10^7.
For n up to 10^18, trial division is too slow; use Pollard’s rho (out of scope for this phase, in Phase 12).

Complexity

Trial division: O(√n). SPF sieve: O(log n) per query after O(N log log N) preprocessing.

Classic Problems

CF 1325E — factor and sum of exponents.
LC 263 (Ugly Number) — recursive division by small primes.
AtCoder ABC 169 D — factor and count exponents.

Pitfalls

Forgetting that after the loop if n > 1: append n as final prime. Easy to miss; corrupts every factorization where n has a prime factor > √n_initial.
Trial-dividing past √n; once p > √n, n is either 1 or itself prime.
Mixing up “number of distinct primes” with “number of prime factors with multiplicity” — these are very different (e.g., 12 = 2²·3 has 2 distinct, 3 with multiplicity).

See Lab 03 — Sieve and Factorization.

7. Combinatorics (`nCr mod p`)

Definition

Compute binomial coefficients modulo a prime. For repeated queries, precompute fact[i] = i! mod p and inv_fact[i] = (i!)^(-1) mod p for i up to N. Then nCr = fact[n] · inv_fact[r] · inv_fact[n-r] mod p in O(1) per query.

For n very large (up to 10^18) and p small (p ≤ 10^5), use Lucas’s theorem: C(n, r) mod p = ∏ C(n_i, r_i) mod p, where n_i, r_i are the base-p digits of n, r. The inner C(n_i, r_i) are computed directly because n_i, r_i < p.

When Used

Counting paths in a grid (C(m+n, m)).
Stars-and-bars: distribute n identical items into k bins → C(n+k-1, k-1).
Inclusion-exclusion sums.
Probability with combinatorial denominators.
Lucas: when n is up to 10^18 (e.g., AtCoder ABC 167 E or grid problems with huge dimensions).

Complexity

Preprocess O(N). Each nCr query O(1). Lucas’s theorem: O(p + log_p(n)) per query (assuming preprocessed factorials up to p).

Classic Problems

CF 1342E — uses nCr.
LC 62 (Unique Paths) — direct C(m+n-2, m-1).
AtCoder ABC 167 E (Colorful Blocks) — inclusion-exclusion with nCr.

Pitfalls

Forgetting to precompute inv_fact separately; computing each query as fact[n] / (fact[r] · fact[n-r]) and trying to use integer division mod p (this is wrong; you need modular inverse).
Off-by-one in fact[] array (forgetting fact[0] = 1).
For Lucas, forgetting that any r_i > n_i gives C(n_i, r_i) = 0, so the whole product is 0.

See Lab 01 — Modular Arithmetic.

8. GCD, LCM, Extended Euclidean

Definition

gcd(a, b) is the greatest common divisor of a, b. Computed by Euclidean: gcd(a, b) = gcd(b, a mod b), base case gcd(a, 0) = a.
lcm(a, b) = a · b / gcd(a, b). Compute as a / gcd(a, b) · b to avoid overflow on intermediate a · b.
Extended Euclidean algorithm finds, alongside gcd(a, b), integers x, y such that a·x + b·y = gcd(a, b). This is the engine for modular inverse when the modulus isn’t prime.

When Used

Reducing fractions.
Solving linear Diophantine equations a·x + b·y = c (solution exists iff gcd(a, b) | c).
Modular inverse via extgcd when the modulus is composite.
Cycle-length problems where the answer involves an LCM.

Complexity

O(log min(a, b)) for both gcd and extgcd.

Classic Problems

CF 822A — direct LCM use.
LC 1071 (Greatest Common Divisor of Strings) — repurposed gcd.
AtCoder ABC 162 D — gcd in a counting problem.

Pitfalls

lcm(a, b) = a * b / gcd(a, b) overflows when a, b are around 10^9. Reorder: lcm = a / gcd * b.
gcd(0, 0) is conventionally 0, but C++ __gcd(0, 0) returns 0; some libraries return undefined. Guard.
Negative a, b: gcd should always be non-negative; some implementations return signs. Use abs().

Geometry

9. Coordinate Geometry Basics (Cross Product, Orientation)

Definition

For two 2D vectors u = (ux, uy) and v = (vx, vy), the cross product is the scalar ux·vy − uy·vx. Its sign tells you the relative orientation of the vectors: positive = counter-clockwise turn, negative = clockwise, zero = collinear. The orientation of three points A, B, C is the sign of the cross product of B − A and C − A; this is the most-used primitive in computational geometry.

When Used

Determining whether three points form a left turn, right turn, or are collinear (convex hull, polygon orientation).
Determining whether a point is on, left of, or right of a line.
Computing twice the signed area of a triangle (the cross product is twice the signed area).
Computing twice the signed area of a polygon (shoelace formula = sum of cross products).

Complexity

O(1) per cross product / orientation test.

Classic Problems

LC 587 (Erect the Fence) — convex hull, uses orientation.
CF 70D (Dynamic Convex Hull) — uses orientation heavily.
AtCoder ABC 207 D — geometry with cross products.

Pitfalls

Using floating-point for cross product when integer arithmetic would suffice — introduces rounding errors that cause “almost collinear” misclassifications. Use long long (or arbitrary precision) when coordinates are integers.
Confusing CCW (counter-clockwise) with CW (clockwise) sign convention.
Overflow in cross product: with coordinates up to 10^9, the product is up to 10^18, which fits long long but not int.

10. Convex Hull (Andrew’s Monotone Chain)

Definition

Given a set of 2D points, the convex hull is the smallest convex polygon containing all of them. Andrew’s monotone chain algorithm sorts points by (x, y), then builds the lower hull left-to-right and the upper hull right-to-left, using the cross-product orientation test to pop points that make a right turn (in the lower hull) or left turn (in the upper hull).

sort(points.begin(), points.end());
vector<P> hull;
// lower hull
for (auto &p : points) {
    while (hull.size() >= 2 && cross(hull[hull.size()-2], hull.back(), p) <= 0)
        hull.pop_back();
    hull.push_back(p);
}
// upper hull
int lower_size = hull.size() + 1;
for (int i = points.size() - 2; i >= 0; --i) {
    while (hull.size() >= lower_size && cross(hull[hull.size()-2], hull.back(), points[i]) <= 0)
        hull.pop_back();
    hull.push_back(points[i]);
}
hull.pop_back();  // last point is the start, duplicated

When Used

Smallest enclosing polygon problems.
Diameter of a point set (rotating calipers on the hull).
Pre-step for various 2D optimization problems (convex layers, dynamic hulls).

Complexity

O(N log N) — dominated by the sort. The two hull-building passes are O(N) amortized.

Classic Problems

LC 587 (Erect the Fence) — direct convex hull.
CF 1093E — uses convex hull as a subroutine.

Pitfalls

<= 0 vs < 0 in the orientation test: <= 0 removes collinear points from the hull (giving the strict hull); < 0 keeps them (giving the inclusive hull). LC 587 wants the inclusive hull (use < 0); most CP problems want the strict hull (use <= 0).
Forgetting to remove the duplicated last point.
Sorting tuples lexicographically without a tie-break — for points with the same x but different y, the sort order matters; (x, y) lexicographic is the right tie-break.

11. Closest Pair of Points (Divide & Conquer Overview)

Definition

Given N points in 2D, find the pair with the smallest Euclidean distance. The naive O(N²) algorithm is to compare every pair. The classical O(N log N) algorithm sorts by x, recursively solves the left and right halves, finds the minimum distance d of the two halves, then merges by inspecting only points within horizontal distance d of the dividing line — and within those, only y-neighbors within distance d. The merge step is O(N) because each strip point only needs to compare against ~6 nearest y-neighbors.

When Used

Direct closest-pair problems.
Any problem where you need a guarantee on minimum spacing (geometric clustering, collision detection).

Complexity

O(N log N) time, O(N) space. The recursion T(N) = 2 T(N/2) + O(N) resolves to O(N log N).

Classic Problems

Codeforces educational round problems labeled “closest pair”.
UVa 10245 (The Closest Pair Problem) — direct.

Pitfalls

For most interview problems, N is small enough (≤ 5000) that O(N²) brute force passes, and writing the divide-and-conquer version is not worth the complexity.
Floating-point distance comparison: compare squared distances (integers, exact) instead of square-rooted distances (floats, lossy).

Sweep & Queries

12. Sweep Line

Definition

A sweep line algorithm imagines a vertical (or horizontal) line sweeping across the plane (or 1D number line) and processing events in the order the sweep encounters them. At each event, you update an “active set” — typically a balanced BST or a multiset — and answer queries based on the current state. The key insight is that between events, the active set is constant, so you only need to process at events.

When Used

1D rectangle/interval union (sum of lengths).
2D rectangle union area (sweep y-coordinate; active set = x-intervals).
Segment intersection problems (Bentley-Ottmann).
Skyline problem (LC 218).
Closest pair (alternate formulation).

Complexity

Typically O((N + E) log N) where E is the number of events; for rectangle union, E = O(N), giving O(N log N).

Classic Problems

LC 218 (The Skyline Problem) — canonical.
LC 850 (Rectangle Area II) — 2D rectangle union.
AtCoder ABC 188 D — 1D event-sweep counting.

Pitfalls

Tie-breaking on event time: when multiple events occur at the same x, process all opens before closes (or vice versa, problem-dependent). Wrong order → off-by-one in the active set.
Using set<int> for the active set when you need to handle duplicate values; switch to multiset<int>.
Updating the answer based on the active set after processing all events at the current x, not in the middle.

See Lab 04 — Sweep Line for Skyline.

13. Coordinate Compression

Definition

Replace large/sparse coordinate values with their ranks in the sorted set of distinct coordinates. If your data has values [10^9, 5, 10^7, 5, 1], compression maps them to ranks [3, 1, 2, 1, 0]. The transformed problem has the same structure but coordinates fit in [0, N), enabling array-indexed data structures (Fenwick tree, segment tree, bucket sort).

sorted_unique = sorted(set(values))
rank = {v: i for i, v in enumerate(sorted_unique)}
compressed = [rank[v] for v in values]

When Used

Counting inversions with a Fenwick tree (values up to 10^9 → compress to [0, N)).
2D rectangle union via sweep line + segment tree on y-coordinates.
DP with a state indexed by a coordinate that’s too large to enumerate.
Almost any problem with value ≤ 10^9 where you would otherwise need a hashmap of size N.

Complexity

O(N log N) for sorting and deduplication; O(N) after that to relabel.

Classic Problems

LC 315 (Count of Smaller Numbers After Self) — Fenwick + compression.
CF 51A — geometry with coord compression.
AtCoder ABC 174 F — distinct elements queries.

Pitfalls

Forgetting to use set (deduplicate) before sorting; otherwise duplicate values get different ranks, breaking equality comparisons.
Using compression when not needed (values already in a small range) — adds complexity for no benefit.

See Lab 05 — Coordinate Compression for Inversions.

14. Mo’s Algorithm

Definition

An offline algorithm for answering Q range queries on an array in O((N + Q) √N) total time when (a) you can move the answer from [l, r] to [l-1, r], [l+1, r], [l, r-1], [l, r+1] in O(1) (or O(log N)); and (b) the queries can be reordered. The trick: sort queries by (l / B, r) where B = √N. Then within a block, r only increases, so total r movement is O(N) per block × √N blocks = O(N √N). Across blocks, l movement is O(√N) per query × Q queries = O(Q √N).

When Used

“Number of distinct values in [l, r]” queries.
“Sum of f(count(v)) for distinct v in [l, r]” queries.
Mode of a range (with auxiliary frequency-of-frequency structure).
Many problems labeled “offline range queries with no updates” on Codeforces.

Complexity

O((N + Q) √N). With N = Q = 10^5, that’s about ~3.2·10^7 operations — passes a 2-second limit comfortably.

Classic Problems

CF 86D (Powerful array) — sum of cnt² · v over distinct v.
SPOJ DQUERY — distinct values in a range.
CF 220B — count of values equal to their frequency.

Pitfalls

The optimal block size is √N; smaller values of B cause TLE because r-movement within a block is too long.
Mo’s algorithm doesn’t handle online queries — queries must be batched and reordered.
The “add/remove element in O(1)” requirement is strict; an O(log N) add/remove makes the total O((N + Q) √N · log N), which usually TLEs.

15. Offline Binary Search / Parallel Binary Search

Definition

When you have Q independent binary-search queries, each of which would naively take O(log V · F) where F is some function evaluation cost, parallel binary search runs all Q queries’ binary searches in lockstep. At each binary-search step, group the queries by their current candidate midpoint, evaluate F once per group, and update each query’s interval. Total cost: O(log V · (F + Q)) instead of O(Q · log V · F).

When Used

“For each query, find the smallest t such that some property P_query(t) holds”, where P is monotone in t and evaluating P is expensive (e.g., requires processing the first t operations of a stream).
Problems where each query is a binary search over time/index/threshold and the function F(t) changes globally with t.

Complexity

O(log V · (F + Q)). For V = N, F = O(N), Q = N: O(N log N) total.

Classic Problems

CF 813F (Bipartite Checking) — parallel binary search on offline DSU.
POI Meteors — the canonical introduction.

Pitfalls

Forgetting that this technique requires queries to be independent (one query’s answer doesn’t depend on another).
Conceptually heavier than Mo’s algorithm; for interview prep, knowing the technique exists and recognizing its signal is more important than implementing it from blank.

Game Theory & Misc

16. Sprague-Grundy / Nim

Definition

In an impartial two-player game (both players have the same available moves at every position), each position has a Grundy number g(pos) defined recursively: g(pos) = mex { g(next) : next ∈ moves(pos) }, where mex is the minimum excluded value (smallest non-negative integer not in the set). A position has Grundy number 0 iff it’s losing for the player to move. Nim’s theorem: the Grundy number of a sum of independent games is the XOR of their individual Grundy numbers.

When Used

Any “two-player game, both move optimally, who wins?” problem with no chance / no hidden information / both players have the same moves.
Decomposing complex games into sums of simpler subgames.
Standard Nim (multiple piles, take any number from any pile, last to move wins): the answer is “first player wins iff XOR of pile sizes ≠ 0”.

Complexity

Computing Grundy via memoization: O(states · branching). For games with state space up to 10^6, this is feasible.

Classic Problems

Standard Nim, multi-pile — XOR of pile sizes.
CF 95A — Grundy on a stair-step game.
AtCoder ABC 195 D — game DP related (not pure Grundy but related).

Pitfalls

The theorem only applies to impartial games. Partisan games (chess, where pieces are colored) don’t satisfy Sprague-Grundy.
“Last to move loses” (misère convention) does NOT in general have the simple XOR rule — only the “last to move wins” (normal convention) does, except in degenerate cases.
Large branching factor + large state space → memoization table doesn’t fit. Look for closed-form patterns by computing Grundy for small n and spotting periodicity.

17. Randomized Algorithms / Stress Testing

Definition

Two related concepts:

Randomized algorithms: algorithms that use random choices to achieve good expected complexity (randomized quicksort, treap, hash-based string matching). Las Vegas algorithms are always correct, randomized in time; Monte Carlo are randomized in correctness.
Stress testing (the bigger interview-prep topic): writing a small brute-force solver, your candidate optimal solver, a random input generator, and a comparator that runs both on every random input until they disagree. This is how CP grandmasters find bugs in their own solutions.

When Used

Stress testing: on every problem you’re not 100% confident about, before submitting.
Randomized algorithms: when a deterministic guarantee isn’t required (probabilistic data structures: Bloom filter, count-min sketch, treap, randomized convex hull).

Complexity

Stress testing is overhead-only — if both your brute and candidate are fast enough on small inputs, stress testing is essentially free. 1000 random tests on N = 10 finishes in <1 second.

Classic Problems

This is a meta-skill, not a problem class. See Lab 06 — Stress Testing.

Pitfalls

Random generator that doesn’t cover edge cases (e.g., always generating distinct elements when the bug is in duplicate handling). Generate adversarially: small N, small value range, allow duplicates.
Comparator that compares output as strings without normalizing whitespace — false positives.
Not seeding the RNG; one accidentally-passing run hides the bug.

See Lab 06 — Stress Testing.

18. Interactive Problems (CP-Style)

Definition

The problem statement defines an interactive protocol: you ask the judge queries (e.g., “is element i greater than element j?”), the judge answers, and after at most K queries you must report the answer. The judge runs as a subprocess and communicates via stdin/stdout. The technique is usually a binary search, ternary search, or adaptive query strategy bounded to O(log N) queries.

When Used

“Find a hidden value in [1, N] in at most log₂ N queries” — straight binary search.
“Find the minimum of a unimodal function” — ternary search.
Adversarial / interactive game-tree problems.

Complexity

O(log N) queries for binary/ternary search; the algorithm is otherwise mostly bookkeeping.

Classic Problems

CF 1207E (XOR Guessing) — adaptive queries.
CF 1486D (Max Median) — interactive binary search.
AtCoder ABC 178 D — not interactive but related decision-problem-as-binary-search.

Pitfalls

Forgetting to flush stdout after every query. This is the single most common interactive bug. In C++: cout << ... << endl; (or cout.flush();); in Python: print(...); sys.stdout.flush() or print(..., flush=True). In Go: bufio.NewWriter with explicit Flush(). If you don’t flush, the judge sees nothing, your program waits for input that never comes, you TLE.
Mixing cin/cout with scanf/printf — buffering interleaves badly.
Reading the judge’s response on the wrong line because of an off-by-one in the query loop.

19. Fast I/O

Definition

The default I/O mechanisms in most languages are line-buffered, locale-aware, and format-aware — which makes them slow. For CP, where you might read 10^6 integers in a 1-second time limit, fast I/O is mandatory. The technique varies by language:

C++: ios_base::sync_with_stdio(false); cin.tie(nullptr); — disconnects cin/cout from C stdio. Speed-up: ~5×. Even faster: scanf/printf directly.
Java: BufferedReader + StreamTokenizer for input; PrintWriter (with explicit flush()) for output. Scanner is too slow for CP — never use it.
Python: sys.stdin.readline instead of input(); sys.stdout.write instead of print for hot loops. For massive input: data = sys.stdin.buffer.read().split() and parse from there. PyPy3 is 5–10× faster than CPython for raw computation; use it whenever available.
Go: bufio.NewReader(os.Stdin) and bufio.NewWriter(os.Stdout); always defer writer.Flush(). fmt.Scan is slow; use a custom token-by-token reader.
JavaScript / TypeScript (Node.js): process.stdin raw read, parse all input at once. Generally the slowest mainstream CP language; not recommended for N ≥ 5·10^5.

When Used

Always, on every CP problem with large input. Cost-of-not-using-fast-I/O: 5–10× slowdown, the difference between AC and TLE.
Less critical on LeetCode-style problems where input is already parsed for you.

Complexity

I/O is O(input_size) either way; fast I/O reduces the constant by ~5–10×.

Classic Problems

Any problem with N ≥ 10^6 integers as input. The constraint itself is a hint: “you need fast I/O”.

Pitfalls

Mixing buffered and unbuffered I/O in the same program. In C++, after sync_with_stdio(false), do not mix cin with scanf or cout with printf. The buffers are independent and output appears out of order.
Forgetting to Flush() in Go. Your output disappears entirely.
Java Scanner. Don’t.
Python print in a loop of 10^6 iterations. Each call locks stdout, flushes, formats — lethal for CP. Buffer with '\n'.join(...) and one final sys.stdout.write.

Progression Playbooks (How To Practice Each Contest Track)

The 19 topics above are the vocabulary. The progression playbooks below are the training plans — which contests to enter, what the success bar is, and what to do after.

1. Codeforces Div 4 Progression

Target rating: unrated → 1200. Goal: solve A–F reliably, in <90 min, cleanly. Contest cadence: every Div 4, ~2/month. Why Div 4: the floor of CF; problems are LC-Easy to LC-Medium difficulty but with CF-style constraints. The skill being trained is speed — you should never get stuck on a Div 4 problem; if you do, the bottleneck is reading/typing speed, not algorithm knowledge.

How to practice: enter every Div 4 live. Aim for A–E in <60 min, F in <90 min total. After contest, upsolve any problems you missed, with editorial open. Track your fastest A–E solve times in a spreadsheet — they should drop ~30% over your first 5 Div 4 contests.

Exit criterion: solve A–F in 6/6 problems consistently in <100 min.

2. Codeforces Div 3 Progression

Target rating: 1200 → 1500. Goal: solve A–E reliably, attempt F. Contest cadence: every Div 3, ~2/month. Skill trained: pattern recognition. Div 3 problems mostly use the patterns from this curriculum (sweep, two pointers, basic DP, basic graph), but the framing is less explicit than LeetCode. You’ll see “given an array, do f(...)” with no LC tag telling you “this is binary search on the answer”.

How to practice: enter every Div 3 live. Aim for A–D in <60 min, E in <120 min. Upsolve E and F after contest. Read the editorial carefully — even if you solved it, see if there’s a slicker approach.

Exit criterion: solve A–E in 5/6 problems consistently within contest time.

3. Codeforces Div 2 Progression

Target rating: 1500 → 1900. Goal: solve A–C reliably, attempt D in 50% of contests. Contest cadence: every Div 2, ~3/month. Skill trained: problem-solving creativity. Div 2 D is where “knowing the technique” stops being enough — you must combine techniques. Div 2 D might require a sweep + DP, or a binary search + greedy, or a Fenwick tree + coordinate compression.

How to practice: enter every Div 2 live. Don’t worry about D in your first 10 contests. Aim for A, B, C clean. After contest, upsolve D with the editorial open; the goal is to learn techniques, not to spend 4 hours stuck. After 10 contests, start attempting D in-contest.

Exit criterion: solve A–C in 8/10 contests; D in 4/10 contests.

4. AtCoder Beginner Contest Progression

Target rating: unrated → 1400 (AtCoder). Goal: solve A–F. Contest cadence: every Saturday/Sunday, ~4/month. Why ABC: AtCoder’s problem statements are remarkably clean (often a single math problem stated tersely), and the difficulty curve A–F is smoother than CF Div 2/3. ABC F is roughly equivalent to CF Div 2 D but with cleaner statements.

How to practice: enter every ABC live. Aim for A–E in <60 min, F in <100 min. ABC F is famous for requiring exactly one “aha” insight per problem; if you don’t see it, move on and upsolve afterward — don’t grind in-contest.

Exit criterion: solve A–E in 9/10 contests; F in 5/10 contests.

5. AtCoder Regular Contest Progression

Target rating: 1400 → 1800 (AtCoder). Goal: solve A–C; attempt D. Contest cadence: ~1/month. Why ARC: ARC is harder than ABC and tests deeper CP techniques — segment tree beats, advanced combinatorics, harder geometry. ARC C is approximately CF Div 1 B / Div 2 E.

How to practice: enter every ARC live. Don’t expect to finish A–C in your first 5 attempts. Upsolve C and D after every contest. ARC is the contest where editorial reading delivers the most learning per minute, because the problems are designed around a specific technique that the editorial will name.

Exit criterion: solve A–B in 8/10 contests; C in 3/10 contests.

6. Stress Testing Methodology

Skill trained: finding bugs in your own solutions before the judge does. Cadence: every problem you’re <90% sure about. Tools: brute solver, candidate solver, random generator, comparator script. See Lab 06 for the full implementation.

How to practice: during a contest, when your candidate passes the samples but you have any doubt, write the stress test. It takes ~3 minutes; it saves the 50-point penalty of a wrong submission, plus the 30 minutes of re-solving after a WA. After contest, on every problem you got wrong, write a stress test that finds your bug. This builds the habit until it’s automatic.

Exit criterion: in 3 consecutive contests, no wrong submissions caused by bugs that would have been caught by stress testing.

7. Reading Editorials Productively

Skill trained: extracting transferable techniques rather than specific solutions. Cadence: after every contest, every problem. Why it matters: the difference between a 1500-rated and a 1900-rated coder is mostly that the 1900 has read 5× more editorials and retained them.

How to practice:

Read the editorial before re-implementing your wrong solution. Do not patch your code; rewrite from blank using the editorial’s approach.
Identify the technique name the editorial uses. “This is binary search on the answer.” “This is a two-pointer sliding window.” Add it to your personal technique catalog (a markdown file with one line per technique → one problem where you saw it).
If the editorial is terse (AtCoder editorials are famously curt), look for community write-ups on Codeforces blogs.
Re-implement from blank, in your own style.

Exit criterion: for every solved problem in your CP journal, you can name the technique and cite one other problem that uses it.

8. Implementation Speed Drills

Skill trained: typing your way out of the problem-statement-to-AC pipeline as fast as possible. Cadence: weekly, 30 min. Why it matters: in a 2-hour Div 2, you have ~25 min per problem on average. If you spend 10 min on syntax errors, you’ve lost 40% of your budget.

How to practice: pick 5 problems you’ve already solved. Type them again, from blank, against a stopwatch. Your second attempt should be 3–5× faster than your first. Do this on the canonical primitives — Sieve, modular inverse, binary exponentiation, Fenwick tree, sweep line — until you can write each from blank in <5 minutes without referring to notes.

Exit criterion: all 6 lab implementations in this phase, written from blank in <15 minutes each.

9. Contest-Time Strategy (Problem Ordering, When To Skip, When To Stress Test)

Skill trained: allocating limited contest time. Cadence: every contest.

The contest-time playbook:

First 10 minutes: read all problems briefly. Mark each as “trivial / hard / ?”. Solve the trivial ones first to build momentum.
Next 30 minutes: solve the medium ones. Allocate ~15 min per problem.
When stuck for 15 min on one problem: skip. Move to the next. Come back later with a fresh perspective. The cost of grinding stuck is paid in opportunity cost on the next problem.
When passing samples but uncertain: write a stress test. 3 minutes invested, 50-point penalty avoided.
Last 30 minutes: decide between (a) attempting a hard problem you haven’t started, or (b) re-checking your earlier solutions. (b) is usually higher ROI unless the hard problem is worth a lot.
Never give up before time expires. Even if every problem is solved or skipped, re-read the hardest unsolved problem one more time — sometimes the third reading triggers an insight the first two didn’t.

Common strategy mistakes:

Grinding A–B for too long when they should take <15 min total. If A is taking >20 min, you’re misreading; re-read.
Submitting before stress-testing on a problem you’re <90% sure about. The penalty hurts more than the time investment.
Skipping the editorial post-contest because “I’ll do it later”. Later never comes.

Mastery Checklist

Tick when each item is true unprompted — i.e., you’d reach for it without consulting notes.

Read constraints first on every problem; can articulate why N ≤ 18 says bitmask, N ≤ 5000 says O(N²), etc.
Modular inverse via FLT in <2 minutes from blank; can switch to extgcd when modulus is composite.
Binary exponentiation for integers in <2 minutes from blank.
Matrix exponentiation for Fibonacci in <10 minutes from blank, including the matrix multiplication primitive.
Sieve of Eratosthenes (basic) in <3 minutes; SPF sieve in <5 minutes; trial-division factorization in <2 minutes.
nCr mod p with precomputed factorials in <5 minutes from blank.
gcd/lcm/extgcd in <3 minutes from blank.
Cross product orientation test on demand; can identify CCW/CW/collinear by sign.
Convex hull (Andrew’s monotone chain) in <15 minutes from blank.
Sweep line for skyline in <20 minutes from blank.
Coordinate compression as a one-line preprocessing.
Mo’s algorithm template (block sort + add/remove handlers) in <20 minutes from blank.
Sprague-Grundy on demand for impartial games up to small state space.
Stress-testing harness (brute, candidate, generator, comparator) in <10 minutes from blank.
Interactive-problem template with explicit flush after every query.
Fast I/O configured by reflex in your primary language.
Read an editorial productively: name the technique, find one other problem using it.
Codeforces Div 3 A–E in 5/6 contests.
Codeforces Div 2 A–C in 8/10 contests.
AtCoder ABC A–F in 5/10 contests.

Exit Criteria

You graduate Phase 7 when all of the following hold:

You have entered ≥10 Codeforces contests live (any division) and ≥10 AtCoder Beginner Contests live.
You have a Codeforces rating of ≥1500 OR you have solved Codeforces Div 2 D in ≥3 contests, in or out of contest.
All 6 labs in this phase are completed with all mastery criteria ticked.
You can explain — out loud, in <60 seconds — what each of the 19 inline topics is, when to reach for it, and what its complexity is.
You have a personal CP journal with ≥30 entries, each one linking the problem name to the named technique used.
You have a stress-testing harness saved as a snippet in your editor and have used it to find a bug in your own code at least 5 times.

If any of these is missing — especially the contests and the journal — you have not exited this phase. Add 2 weeks and re-check.

Labs

#	Lab	What It Builds
1	Modular Arithmetic	`nCr mod p` with factorial precomputation + modular inverse
2	Binary Exponentiation	`a^b mod p` and matrix exponentiation for Fibonacci
3	Sieve and Factorization	Count primes, sum of primes, SPF table for fast factorization
4	Sweep Line	The Skyline Problem (LC 218) via canonical sweep
5	Coordinate Compression	Counting inversions via Fenwick tree + compression
6	Stress Testing	Brute + candidate + random generator + comparator harness

← Phase 6 — Greedy · Phase 8 — Practical Engineering →

LeetCode Interview — Extreme Coding