Lab 03 — Sieve of Eratosthenes and Smallest-Prime-Factor Factorization

Goal

Master prime enumeration and integer factorization at competitive scale: count primes up to N = 5 · 10^6 in under 100ms, and factorize Q = 10^5 integers ≤ N in O(log n) each via a precomputed smallest-prime-factor (SPF) table. By the end, you can write the linear sieve from blank in <5 minutes.

Background

Many CP problems reduce to “is x prime?”, “what are the prime factors of x?”, or “how many primes ≤ N?”. For N up to a few million, a sieve answers all three in O(N log log N) (Eratosthenes) or O(N) (Euler/linear sieve), and the SPF byproduct lets you factorize any x ≤ N in O(log x) time. For larger N (up to 10^9), you need Miller-Rabin primality tests and Pollard’s rho for factorization (out of scope here).

The sieve is the single most reused primitive across number-theoretic CP problems. Get this fluent and you save 5–10 minutes per problem.

Interview Context

LC 204 (Count Primes) is the canonical screen at FAANG mid-level. Quant interviews ramp it up: “factorize each of 10^5 numbers up to 10^7”. Both questions test the same primitive, but at different scales — the scale forces the candidate to pick the right data structure (bitset vs vector<bool> vs SPF table). A candidate who reaches for SPF without prompting signals CP fluency.

Problem Statement

Three sub-problems on the same engine:

Count primes ≤ N. LC 204. N ≤ 5 · 10^6.
Sum of primes ≤ N. Variant: instead of count, return sum mod 10^9 + 7.
Factorize Q numbers ≤ N. For each x_i, output its multiset of prime factors. Q ≤ 10^5, x_i ≤ N.

Constraints

N ≤ 5 · 10^6 for sub-problems 1 and 2.
N ≤ 10^7, Q ≤ 10^5 for sub-problem 3.
Time limit: 1 second.
Memory limit: 256 MB. SPF as int array uses 4 · 10^7 = 40 MB; fine.

Clarifying Questions

“Is 0 and 1 prime?” — neither. Standard convention.
“Factorization output: ordered or multiset?” — multiset, e.g., 12 → [2, 2, 3].
“Are inputs guaranteed ≤ N?” — yes for this lab. Otherwise switch to trial division O(√x) or Pollard’s rho.
“Need primes only or all factors?” — primes only here (canonical). Divisors are a separate problem.
“Single-threaded?” — yes. Sieves don’t parallelize trivially without contention.

Examples

Sub-problem 1: count_primes(10) = 4 (2, 3, 5, 7). count_primes(2) = 1. count_primes(1) = 0.
Sub-problem 2: sum_primes(10) = 17.
Sub-problem 3: factorize(12) = [2, 2, 3]. factorize(1) = []. factorize(7) = [7]. factorize(60) = [2, 2, 3, 5].

Initial Brute Force

For sub-problem 1: for each x in [2, N], run trial division up to √x.

def is_prime(x):
    if x < 2: return False
    for d in range(2, int(x**0.5) + 1):
        if x % d == 0: return False
    return True

def count_primes_brute(N):
    return sum(1 for x in range(2, N) if is_prime(x))

For sub-problem 3: trial division of each x_i.

def factorize_brute(x):
    out = []
    d = 2
    while d * d <= x:
        while x % d == 0:
            out.append(d); x //= d
        d += 1
    if x > 1: out.append(x)
    return out

Brute Force Complexity

Sub-problem 1 trial division: O(N √N). For N = 5·10^6, that’s ≈ 10^10 ops — impossible in 1 second.

Sub-problem 3 trial division per query: O(√x). For Q = 10^5 and x = 10^7, that’s ~3·10^8 ops — borderline. Sieve-based factorization is O(log x) ≈ 24 ops, ~2.4·10^6 total — comfortably faster.

Optimization Path

Trial division (above).
Sieve of Eratosthenes for sub-problems 1, 2: mark composites; remaining are primes. O(N log log N).
Linear (Euler) sieve + SPF table: each composite is crossed off exactly once via its smallest prime factor; the SPF byproduct enables O(log x) factorization. O(N) preprocessing.

For interviews, Eratosthenes is fine; for CP the linear sieve is the default once you’ve drilled it.

Final Expected Approach

Sieve of Eratosthenes (sub-problems 1, 2):

vector<bool> sieve(int N) {
    vector<bool> is_prime(N + 1, true);
    is_prime[0] = is_prime[1] = false;
    for (int i = 2; (long long)i * i <= N; ++i)
        if (is_prime[i])
            for (int j = i * i; j <= N; j += i)
                is_prime[j] = false;
    return is_prime;
}

Linear sieve with SPF table (sub-problem 3):

vector<int> spf;       // smallest prime factor
vector<int> primes;

void linear_sieve(int N) {
    spf.assign(N + 1, 0);
    primes.clear();
    for (int i = 2; i <= N; ++i) {
        if (spf[i] == 0) { spf[i] = i; primes.push_back(i); }
        for (int p : primes) {
            if ((long long)p * i > N || p > spf[i]) break;
            spf[p * i] = p;
        }
    }
}

vector<int> factorize(int x) {
    vector<int> out;
    while (x > 1) { out.push_back(spf[x]); x /= spf[x]; }
    return out;
}

Data Structures Used

vector<bool> (or bitset for memory efficiency) for the sieve mark array.
vector<int> for the SPF table (one int per index up to N).
vector<int> for the prime list.

A bitset packs 8× denser than vector<bool>; for very large N (5 · 10^7) it matters.

Correctness Argument

Eratosthenes correctness. Inductive claim: when iteration i starts, is_prime[j] is correct for all j < i. If is_prime[i] == true, no j < i has j | i (else it would have unmarked i); so i has no proper divisor < i, so i is prime. We then mark all multiples i², i² + i, i² + 2i, ... as composite. Multiples below i² were already marked by smaller prime divisors. By strong induction, the array is fully correct after the loop terminates.

Linear sieve correctness. Each composite n is marked exactly once, when i = n / spf(n) and the inner loop reaches p = spf(n). The loop guard p > spf[i] ensures p ≤ spf[i], so p · i has smallest prime factor p (because p ≤ spf[i] ≤ all other prime factors of i). So the assignment spf[p * i] = p is correct. Each composite has a unique (i, p) pair, hence is visited exactly once → O(N) total work.

Factorization correctness. While x > 1, spf[x] is x’s smallest prime factor, so emit it and divide. Every iteration strictly decreases x by a factor of at least 2, so loop runs ≤ log₂ x times.

Complexity

Eratosthenes: O(N log log N) time, O(N / 8) space with bitset.
Linear sieve + SPF: O(N) time, O(N) space.
Factorize each query: O(log x).
Total for sub-problem 3: O(N + Q log x).

Implementation Requirements

Use (long long)i * i to compute i² to avoid overflow at large N.
Sieve inner loop starts at i², not 2i — multiples below i² are already marked.
For sum-of-primes mod p, mod the running sum after each addition.
SPF allocation: N + 1 entries (for index N).

Tests

def test_count_primes():
    assert count_primes(10) == 4
    assert count_primes(2) == 1
    assert count_primes(1) == 0
    assert count_primes(100) == 25
    # Stress vs brute up to N = 1000
    for n in range(2, 1001):
        assert count_primes(n) == count_primes_brute(n)

def test_factorize():
    linear_sieve(1000)
    assert factorize(12) == [2, 2, 3]
    assert factorize(7) == [7]
    assert factorize(1) == []
    assert factorize(60) == [2, 2, 3, 5]
    # Stress vs brute
    for x in range(1, 1001):
        assert sorted(factorize(x)) == sorted(factorize_brute(x))

Edge cases: N = 0, N = 1, N = 2, prime N, prime power N.

Follow-up Questions

“What if N = 10^9?” — sieve is impossible (memory + time). Use Miller-Rabin for primality, Pollard’s rho for factorization.
“What if N = 10^11 and you need only the count?” — Meissel-Mertens method, O(N^(2/3)). Out of scope here.
“How would you parallelize the sieve?” — segmented sieve: split [2, N] into blocks of size √N, sieve each block independently after computing primes ≤ √N. Each block fits in cache; threads work on disjoint blocks.

Product Extension

Cryptography service: precompute small primes for trial division as a Miller-Rabin pre-filter. Number-theoretic libraries (GMP, FLINT) cache a small-prime table at startup. RSA key generation does trial division up to ~10^5 before falling back to Miller-Rabin, yielding ~10× speedup on rejecting composites.

Language / Runtime Follow-ups

Python: plain Python list of bools is slow; use bytearray (8× faster than list-of-bool). For N = 5·10^6, bytearray sieve runs in ~0.3 s; PyPy / Cython get to ~50 ms.
Java: BitSet is more memory-efficient than boolean[] and ~equally fast.
Go: plain []bool is fine; built-in bitset doesn’t exist (write your own with []uint64).
C++: vector<bool> is bit-packed by default; use bitset<N+1> for stack-allocated speed at compile-time-fixed N.
JS/TS: Uint8Array for the sieve. Number integer math is exact below 2^53.

Common Bugs

Sieve inner loop starting at 2i instead of i² — correct but slower.
Marking is_prime[0] and is_prime[1] as true.
Off-by-one: allocating N entries instead of N + 1.
In linear sieve, missing the p > spf[i] break condition → quadratic behavior.
Computing i * i as int and overflowing for i > ~46340.
For factorization output, dividing x by spf[x] but forgetting to also push spf[x].

Debugging Strategy

If sieve is wrong:

Print primes ≤ 30 from your sieve. Should be [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]. If 1 is included or 9 is missed, check init / inner loop start.

If factorize is wrong:

Print spf[2..30]. Should be [2,3,2,5,2,7,2,3,2,11,2,13,2,3,2,17,...] (smallest prime factor of each index).
If SPF correct but factorization wrong: the loop should emit spf[x] and divide; common bug is doing one or the other.

Mastery Criteria

Write Sieve of Eratosthenes from blank in <3 minutes.
Write linear sieve with SPF from blank in <5 minutes.
Articulate why each composite is marked exactly once in linear sieve.
Use SPF to factorize an integer in <30 seconds.
Recognize N ≤ 10^7 + factor-related question as the sieve trigger within 30 seconds.
State the alternative when N > 10^9 (Miller-Rabin + Pollard’s rho).

← Lab 02 — Binary Exponentiation · Phase 7 README · Lab 04 — Sweep Line →

LeetCode Interview — Extreme Coding