Lab 02 — Constraints Extraction
Goal
Train the discipline of converting written constraints into algorithmic targets. Given 1 ≤ N ≤ 10^5, you should immediately think “O(N log N) is the budget, O(N^2) will TLE”.
Background Concepts
- The constraint-to-complexity mapping (FRAMEWORK.md)
- Operations-per-second budget on competitive judges (~10^8 simple ops/sec safe)
- Implicit constraints (memory, integer overflow, recursion depth)
Interview Context
In live interviews, the interviewer often omits explicit constraints, expecting you to ask. Constraints discipline the algorithm choice. Candidates who skip this step often write an O(N^2) algorithm that the interviewer was hoping they’d avoid.
Problem Statement
For each of the 10 problem prompts below, derive:
- The complexity budget
- At least one algorithm family that fits
- At least one approach that does not fit and why
Prompts:
1 ≤ N ≤ 20— count subsets satisfying property X1 ≤ N ≤ 200— shortest path in weighted graph with up to N^2 edges1 ≤ N ≤ 10^4, queries≤ 10^5— range sum1 ≤ N ≤ 10^5— find duplicate1 ≤ N ≤ 5 × 10^5— kth smallest in array1 ≤ N ≤ 10^6, integers up to 10^9 — count of pairs with sum ≤ KN ≤ 10^9, queries Q ≤ 10^5 — count of integers in[1,N]divisible by KT test cases, T ≤ 10^4, each withN ≤ 10^3— pairwise XOR maximum1 ≤ a, b ≤ 10^18— computea^b mod pStream of up to 10^7 elements— top-K running
Constraints
The point of this lab is constraints. Treat each prompt as separate.
Clarifying Questions
For each prompt, list:
- What’s the operation budget?
- Is the time limit explicit (e.g., 1 sec, 2 sec)?
- Is there a memory limit (e.g., 256 MB)?
- Are values within int32 / int64 range?
Examples
Worked example for prompt #4 (N ≤ 10^5, find duplicate):
- Budget: ~10^7–10^8 ops → O(N), O(N log N), O(N · log_max) all fit
- Fits: hashset O(N), sort+scan O(N log N), Floyd’s cycle if input ∈
[1,N] - Doesn’t fit: O(N^2) brute force (10^10 ops)
Initial Brute Force
Not applicable — meta-lab.
Brute Force Complexity
N/A.
Optimization Path
The optimization path here is constraint-to-algorithm mapping. Memorize the table from FRAMEWORK.md.
Final Expected Approach
For each prompt, write your final answer as:
Prompt #K:
Budget: O(...)
Fitting algorithm family: ...
Disqualified approach: ... because budget / memory / overflow / etc.
Data Structures Used
The point is to map N range → DS choice:
N ≤ 20: bitmask, recursion (no DS needed)N ≤ 200: 2D arrays (Floyd, etc.)N ≤ 10^4: O(N^2) DP arrays, simple sortN ≤ 10^5: hashmap, heap, sorted set, segment tree, FenwickN ≤ 10^6: array + linear scan, two pointers, hash, no log factorsN ≤ 10^9: math, binary search on answer, sieve segment
Correctness Argument
The argument here is budget: justify why your chosen algorithm fits within ~10^8 ops/sec * time-limit. Be explicit: N · log N = 10^5 · 17 ≈ 1.7 × 10^6 — comfortably fits 1-second limit.
Complexity
For each prompt, you produce both:
- The budget
- The justification per the table
Implementation Requirements
Written deliverable. No code.
Tests
Not applicable for this lab.
Follow-up Questions
For prompt #6 (10^6 elements, pairs with sum ≤ K):
- “What if values can be negative?” → may need different sort + two-pointer logic
- “What if we want to enumerate the pairs, not just count?” → output limit changes everything
- “What if the array streams in?” → online algorithm needed
For prompt #9 (a, b ≤ 10^18):
- “What if
pis not prime?” → can’t use Fermat’s little theorem inverse - “What if we need
a^bexactly (no mod)?” → impossible for these sizes
Product Extension
In production, “constraint” often means “expected QPS × max payload size × peak time”. A request handler that’s O(payload_size^2) is fine for size 10 but catastrophic for size 10^4. Same intuition as competitive judges, just different vocabulary.
Language/Runtime Follow-ups
- Python: constant factor ~30–100× slower than C++. Halve your effective budget.
N=10^5with O(N^2) is risky in Python. - Java: ~2–4× slower than C++. Beware autoboxing in hot loops.
- Go: ~2× slower than C++. Map operations have higher constants than
unordered_map. - C++: baseline. Use
ios_base::sync_with_stdio(false)for fast I/O. - JavaScript: ~3–10× slower than C++. Avoid object lookups in hot loops; prefer typed arrays.
Common Bugs
- Forgetting Q queries multiplies your budget (10^5 queries × 10^5 per-query = 10^10!)
- Forgetting T test cases (e.g., T = 10^4 with O(N^2) per test, N = 10^3 → 10^10)
- Underestimating constants in Python/JS
- Forgetting recursion depth limits (Python default 1000)
- Forgetting integer overflow at int32 boundary (~2 × 10^9)
Debugging Strategy
When code TLEs:
- Recompute total operations: outer loops × inner work × test cases × queries
- Check the constant factor (string concat in a loop is a classic disaster)
- Profile to find the actual hotspot (often I/O, not algorithm)
Deliverable
For each of the 10 prompts above, write the structured answer block. Compare yours to the table at FRAMEWORK.md.
Mastery Criteria
- Correctly mapped 10/10 prompts to budget within 30 seconds each
- Identified at least one disqualifying approach for each (per query / total ops)
- Recognized the multiplier effect of T test cases / Q queries
- Identified at least 2 prompts where Python/JS would need extra care