Phase 8 — Practical Engineering Coding Interviews

Target level: Medium-Hard → Hard (senior+ practical interview track)
Expected duration: 4 weeks
Weekly cadence: 5–6 labs/week, with each lab requiring a complete working implementation, tests, and rehearsed answers to follow-ups
Companies this targets: Big Tech L5+ (Google L5/L6, Meta E5/E6, Amazon SDE-III/Principal, Microsoft Sr/Principal), Stripe, Uber, Airbnb, Cloudflare, Datadog, Snowflake, Databricks, infrastructure-heavy startups


Why This Phase Exists

Phase 2 through Phase 7 trained you to recognize patterns and produce optimal algorithms under a stopwatch. That training is necessary and remains the gating function for the first 30 minutes of most rounds. But there is a second, distinct kind of coding interview that you will face starting at the senior level (and at every level at companies like Stripe, Airbnb, and Uber where the engineering bar is calibrated against production code rather than against contest performance).

That second kind of interview is the practical engineering coding round. You are asked to “build an LRU cache”, “build a rate limiter”, “build a thread pool”, “build a job queue”, “build a small in-memory filesystem”. The problem is not algorithmically extreme — most of these have textbook solutions you could find in a CS curriculum. What the interviewer is testing is whether your code looks like production code:

  • Are your data structures encapsulated behind a clean API?
  • Are mutations and reads separated cleanly?
  • Are concurrency invariants explicit, or did you sprinkle locks “just in case”?
  • Do you handle partial failure, shutdown, and resource cleanup?
  • Did you write tests that actually exercise the contract — including concurrency tests where relevant?
  • Can you answer the inevitable follow-ups about scaling, observability, and operational concerns?

Candidates from a pure LeetCode background routinely fail this round. They produce a one-function LRUCache that passes the LeetCode test cases, then freeze when the interviewer asks “how would you make this thread-safe?” or “how would you observe this in production?” or “what would you do if a put could fail mid-operation?” The interviewer’s note reads: “Strong on the algorithm, weak on engineering. No-hire for senior.”

The bar at senior+ practical interviews is not “did you write code that produces the right answer”. The bar is “did you write code that I would be willing to deploy”. Those are different bars, and this phase trains the second one explicitly.


What Makes Practical Problems Different From LeetCode

| Dimension | LeetCode-style | Practical engineering |
| --- | --- | --- |
| Optimization target | Big-O time, sometimes space | API surface, testability, operational fitness, correctness under concurrency |
| Code length | 20–40 lines | 100–400 lines (a class with several methods + tests) |
| State | Local to a function | Owned by a long-lived object with invariants across calls |
| Concurrency | Almost never tested | Almost always at least raised as a follow-up |
| Failure modes | “Wrong answer on test 47” | Partial failure, restart, poison input, backpressure, shutdown |
| Tests | Provided by the judge | You write them |
| Follow-ups | Variant problems with tweaked constraints | Operational reality questions (“scale to N nodes”, “persist across restarts”) |
| Bar for excellence | Optimal complexity | Production readability + correctness + answers all follow-ups crisply |

A LeetCode answer that nails the algorithm but ships a 60-line wall-of-code with single-letter variables and no separation of concerns will get a no-hire at the senior bar even when the algorithm is correct. Conversely, a practical answer that is a little slower than optimal but is cleanly structured, well-tested, and accompanied by sharp follow-up answers will get a strong hire.

You will not “see” this difference until you’ve practiced enough practical labs to internalize what “clean” looks like at the senior bar. That internalization is the entire point of this phase.


The 13 Standard Follow-Ups

Every problem in this phase will be followed by a subset of these thirteen questions. They are not problem-specific — they are senior-bar questions that recur across the industry. Memorize the question list. Then, for each lab in this phase, rehearse the answer for the 4–6 follow-ups that are most natural for that problem. By the end of Phase 8 you should be able to give a 60-to-90-second answer to any of these for any data structure or service-shaped object you’ve built.

  1. How would you make it thread-safe? Identify the critical sections; choose among a coarse-grained mutex, fine-grained locks, sharded locks, or a lock-free design built on compare-and-swap (CAS); justify the choice; name the failure modes the choice avoids (deadlock, lost update, torn read); and state the contention behavior under load.
  2. How would you persist state across restarts? Pick between full snapshot, log-structured append (write-ahead log), and snapshot+log; address durability (fsync), atomicity (rename or checksum), and recovery (replay log on boot). State the time-to-recover and the worst-case data loss window.
  3. How would you scale to N nodes? Decide between sharding (partition the keyspace) and replication (read scaling), then pick the routing scheme that locates a key’s owner (consistent hashing + virtual nodes). Address rebalancing, hotspotting, and cross-node operations. Don’t reach for “distribute everything”; most practical objects scale by sharding.
  4. How would you observe and monitor it? Name the four signals (latency, traffic, errors, saturation — Google’s Golden Signals) and state which metric you’d emit for each. Specify whether you’d export histograms (latency), counters (events), or gauges (queue depth). Describe the dashboard you’d build.
  5. How would you test it? Three layers minimum: unit tests on each method’s contract; integration / smoke tests on end-to-end flows; concurrency / stress tests where multiple goroutines or threads exercise the object. Mention property-based testing where invariants are clean.
  6. What metrics would you emit? Per-operation counters (puts, gets, hits, misses); per-operation latency histograms; queue / cache size gauges; failure-class counters (eviction, timeout, retry, poison). Reject the temptation to emit everything — emit what you’d actually look at on a 3 AM page.
  7. How would you handle backpressure? Decide between blocking the producer, dropping the request, returning an error, or buffering with a bounded queue and rejection policy. State which one you chose and why. The wrong answer here is “we’d have a really big buffer” — that just delays the problem and worsens latency.
  8. How would you handle partial failure? Identify which operations can fail mid-way (a write that succeeds locally but fails to persist; a network call that times out without confirmation). Choose between idempotent retry, two-phase commit, log-and-recover, or just-fail-fast. Don’t reach for “transactions” reflexively — pick the tool that matches the problem.
  9. What is the eviction policy and cleanup strategy? For caches: LRU / LFU / TTL / size-bounded. For queues: drop oldest / drop newest / dead-letter. For background state: TTL + a scavenger thread or goroutine. State the worst-case eviction storm.
  10. What is the consistency model? Strong (linearizable), sequential, causal, or eventual, plus session guarantees such as monotonic reads. A single-process in-memory object whose operations all run under one lock is trivially linearizable; the question becomes interesting once the state is replicated. Be precise about what guarantees you offer.
  11. What configuration knobs would you expose? Capacity, TTL, retry count, backoff base, concurrency limit, shutdown timeout. State sensible defaults. Critically: state the knobs you would not expose, because over-configuration is its own production smell.
  12. What is the shutdown / draining behavior? On close() / SIGTERM: stop accepting new work, finish in-flight work up to a deadline, persist or surface anything not finished, release resources. Specify the deadline. Specify what happens when the deadline expires.
  13. How would you handle a poison-pill input? A request that crashes the worker, exhausts memory, or causes an infinite loop. Bound resource usage per request, isolate the worker, route repeat-offending payloads to a dead-letter queue, and emit a metric. Never silently drop them.
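As one illustration of follow-up 7, the sketch below shows a bounded buffer whose overflow behavior is an explicit, named policy instead of an accident of the implementation. The names BoundedBuffer, OverflowPolicy, offer, and poll are hypothetical, chosen for this example; the default timeout is an assumption, and a lab-grade version would likely add a blocking poll and real metrics.

```python
import threading
from collections import deque
from enum import Enum


class OverflowPolicy(Enum):
    REJECT = "reject"            # fail fast: the caller sees False immediately
    DROP_OLDEST = "drop_oldest"  # evict the oldest queued item to make room
    BLOCK = "block"              # wait up to `timeout` seconds for space


class BoundedBuffer:
    """Thread-safe: `_lock` guards `_items`, `rejected`, and `dropped`."""

    def __init__(self, capacity, policy, timeout=1.0):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._policy = policy
        self._timeout = timeout
        self._items = deque()
        self._lock = threading.Lock()
        self._not_full = threading.Condition(self._lock)
        self.rejected = 0  # stand-ins for counter metrics
        self.dropped = 0

    def offer(self, item) -> bool:
        """Return True if the item was enqueued, False if it was rejected."""
        with self._lock:
            if len(self._items) >= self._capacity:
                if self._policy is OverflowPolicy.REJECT:
                    self.rejected += 1
                    return False
                if self._policy is OverflowPolicy.DROP_OLDEST:
                    self._items.popleft()
                    self.dropped += 1
                else:  # BLOCK: wait for a consumer, but never forever
                    has_space = self._not_full.wait_for(
                        lambda: len(self._items) < self._capacity,
                        timeout=self._timeout,
                    )
                    if not has_space:
                        self.rejected += 1
                        return False
            self._items.append(item)
            return True

    def poll(self):
        """Return the oldest item, or None if the buffer is empty."""
        with self._lock:
            if not self._items:
                return None
            item = self._items.popleft()
            self._not_full.notify()  # wake one blocked producer
            return item
```

Whichever policy is chosen, the point to make out loud is that the buffer is bounded and that every rejected or dropped item is counted, so overload shows up on a dashboard rather than as silent latency growth.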

For each lab, the Follow-up Questions section selects 4–6 of these and rehearses an answer. Memorizing one bullet per question is not enough — you need to be able to converse about the choice, naming alternatives and tradeoffs.


Implementation Discipline Expected In This Phase

This is the heaviest phase by code volume. Every lab demands a complete working implementation, not pseudocode and not a sketch. The bar is “could a coworker submit this for code review without me being embarrassed?”. Concretely:

  • Idiomatic in the chosen language. Python uses snake_case, dataclasses where natural, with-statement blocks for locks, and asyncio where the lab demands async. Java uses camelCase, prefers java.util.concurrent primitives, and declares interfaces. Go uses short receiver names, returns errors as the last value, and prefers channels for fan-out and mutexes for shared state.
  • Small functions, one concern per function. A method that does both validation and mutation should be split. The exception is hot-path code where inlining matters; if you inline, leave a one-line comment explaining why.
  • Names that describe intent, not type. evict_lru() not e(), pending_jobs not pj, acquire_token_or_block(timeout) not take().
  • Separation of concerns. Storage, eviction policy, concurrency primitives, observability hooks, and configuration are all distinct concerns. Most labs in this phase have natural seams between them — find the seams and respect them. A class that mixes “manages state”, “decides policy”, and “emits metrics” in every method is harder to test than three classes that compose.
  • Testable design. Every public method has an obvious test. Constructors take their dependencies (the eviction policy, the clock, the metrics emitter) as parameters so tests can inject fakes. Hardcoded wall-clock calls (time.time() in Python, System.currentTimeMillis() in Java, time.Now() in Go) inside business logic are a code smell; inject a clock instead.
  • Explicit error handling. Every external call has a defined behavior on failure. Silent try/except: pass is forbidden unless accompanied by a comment explaining why the exception is benign.
  • Concurrency invariants documented. If a class is thread-safe, say so in the docstring and name the lock that guards each field. If a class is not thread-safe, say so. The forbidden state is “it might be thread-safe, the author didn’t think about it”.
  • No premature abstraction. Two implementations of an interface justify the interface. One implementation does not. Don’t add a Storage interface for the in-memory backing map until you actually have a second backing.
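Two of the bullets above, the injected clock and the documented concurrency invariant, are easier to judge in code. The following is a minimal sketch under stated assumptions: TtlCache and FakeClock are hypothetical names, expiry is lazy (checked on read only), and a miss is reported as None.

```python
import threading
import time
from typing import Callable, Dict, Hashable, Tuple


class TtlCache:
    """Cache whose entries expire `ttl_seconds` after they are written.

    Thread-safe: `_lock` guards `_entries`. The clock is a constructor
    parameter so tests can advance time without sleeping.
    """

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        # key -> (expires_at, value)
        self._entries: Dict[Hashable, Tuple[float, object]] = {}

    def put(self, key: Hashable, value: object) -> None:
        expires_at = self._clock() + self._ttl
        with self._lock:
            self._entries[key] = (expires_at, value)

    def get(self, key: Hashable):
        """Return the live value for `key`, or None if absent or expired."""
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            if entry is None:
                return None
            expires_at, value = entry
            if now >= expires_at:
                del self._entries[key]  # lazy expiry on read
                return None
            return value


class FakeClock:
    """Test double: time moves only when the test calls advance()."""

    def __init__(self, start: float = 0.0):
        self.now = start

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds
```

With the fake in place, an expiry test becomes three lines: put a key, call clock.advance(ttl), and assert that get returns None. No sleep, no flakiness.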

The labs do not enforce a single language across the phase. Pick Python, Java, or Go for each lab based on what feels natural. Most candidates default to Python because the standard library is rich and the syntax is dense; Java is a strong choice when concurrency and java.util.concurrent primitives are at the heart of the problem (thread pool, blocking queue, atomic counters); Go is excellent when the problem is naturally concurrent and channel-shaped (job queue, dispatcher, crawler). For each lab, the Language/Runtime Follow-ups section calls out the right idiomatic choice in each of the major languages.


The 23 Labs

| # | Lab | Core Idea (one line) |
| --- | --- | --- |
| 01 | LRU Cache | O(1) get/put via doubly-linked list + hashmap; the canonical practical-coding warmup |
| 02 | LFU Cache | Frequency-bucketed eviction; tie-breaking by recency; harder than LRU |
| 03 | Rate Limiter | Four algorithms compared: token bucket, leaky bucket, sliding window log, sliding window counter |
| 04 | Task Scheduler | Priority-aware task scheduling with retries, backoff, and a dead-letter queue |
| 05 | Thread Pool | Bounded worker pool with work queue, graceful shutdown, and rejection policy |
| 06 | Durable Job Queue | At-least-once delivery semantics with idempotency keys and ack/nack |
| 07 | Autocomplete | Trie + per-prefix top-K with weighted scores and sub-millisecond response |
| 08 | Log Parser | Streaming log line parser with regex extraction and bounded memory |
| 09 | File Deduplication | Three-stage pipeline: size → quick hash → full hash |
| 10 | Consistent Hashing | Hash ring with virtual nodes, minimal key movement on add/remove |
| 11 | Message Dispatcher | Fan-out to N consumers with fairness, priority, and per-consumer backpressure |
| 12 | Pub/Sub | In-memory topic-based publish/subscribe with wildcard subscriptions |
| 13 | Timer Wheel | Hierarchical timer wheel for O(1) amortized timer scheduling |
| 14 | Key-Value Store | In-memory KV with TTL, snapshot+WAL persistence, and crash recovery |
| 15 | Retry With Backoff | Exponential backoff + decorrelated jitter + max-attempts + retryable-error policy |
| 16 | Circuit Breaker | Three-state machine: closed / open / half-open with sliding-window failure counting |
| 17 | Metrics Collector | Counter / gauge / histogram with bounded memory and atomic updates |
| 18 | Web Crawler | Concurrent crawler with depth limit, politeness (per-host throttle), and dedup |
| 19 | In-Memory Filesystem | ls, mkdir, addContentToFile, readContent over a tree of inodes |
| 20 | Snake Game | State machine + collision detection + score; classic OOD round |
| 21 | Tic-Tac-Toe Streaming | O(1) winner check by maintaining row/col/diagonal counters |
| 22 | Text Editor Buffer | Gap buffer / piece table for cursor-local edits; the canonical editor data structure |
| 23 | SQL-Like Engine | Toy parser + executor for SELECT … FROM … WHERE … JOIN … over in-memory tables |
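To make the lab 01 core idea concrete, here is a minimal sketch of the doubly-linked list + hashmap layout. The class and method names (LruCache, _Node, _evict_lru) are hypothetical, and returning None on a miss is an assumption of this sketch; it means a stored None cannot be told apart from an absent key.

```python
from typing import Dict, Hashable, Optional


class _Node:
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key = key
        self.value = value
        self.prev: Optional["_Node"] = None
        self.next: Optional["_Node"] = None


class LruCache:
    """Fixed-capacity cache with O(1) get and put.

    Not thread-safe: callers must serialize access externally.
    """

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._nodes: Dict[Hashable, _Node] = {}
        # Sentinels: _head.next is the most recently used node,
        # _tail.prev is the least recently used node.
        self._head = _Node()
        self._tail = _Node()
        self._head.next = self._tail
        self._tail.prev = self._head

    def get(self, key: Hashable):
        node = self._nodes.get(key)
        if node is None:
            return None
        self._move_to_front(node)
        return node.value

    def put(self, key: Hashable, value) -> None:
        node = self._nodes.get(key)
        if node is not None:
            node.value = value
            self._move_to_front(node)
            return
        if len(self._nodes) >= self._capacity:
            self._evict_lru()
        node = _Node(key, value)
        self._nodes[key] = node
        self._link_front(node)

    def _evict_lru(self) -> None:
        lru = self._tail.prev
        self._unlink(lru)
        del self._nodes[lru.key]

    def _move_to_front(self, node: _Node) -> None:
        self._unlink(node)
        self._link_front(node)

    def _unlink(self, node: _Node) -> None:
        node.prev.next = node.next
        node.next.prev = node.prev

    def _link_front(self, node: _Node) -> None:
        node.prev = self._head
        node.next = self._head.next
        self._head.next.prev = node
        self._head.next = node
```

The sentinel nodes remove every empty-list special case from _unlink and _link_front, which is the kind of small structural choice an interviewer in this format tends to notice.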

The order is not arbitrary. Labs 1–6 are the canonical warmups (LRU is among the most frequently asked problems in this format). Labs 7–14 stretch into harder data-structure and operational territory. Labs 15–17 are pure operational primitives (retry, circuit breaker, metrics) that show up in service-design rounds. Labs 18–23 are larger, more open-ended OOD-style problems where the interviewer wants to see how you decompose a fuzzy problem into classes.
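The operational primitives in labs 15–17 are small enough to sketch in a few dozen lines. As an example, the block below is one possible shape for lab 15. The function name retry_with_backoff, the TransientError marker class, and all default values are assumptions made for illustration. The jitter rule used here draws each delay uniformly from [base, 3 × previous delay] and caps it, which is one common formulation of decorrelated jitter.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


def retry_with_backoff(
    operation: Callable[[], T],
    *,
    max_attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    retryable: Tuple[Type[BaseException], ...] = (TransientError,),
    sleep: Callable[[float], None] = time.sleep,
    uniform: Callable[[float, float], float] = random.uniform,
) -> T:
    """Call `operation` until it succeeds or attempts run out.

    Only exceptions listed in `retryable` are retried; anything else
    propagates immediately. `sleep` and `uniform` are injected so tests
    run instantly and deterministically.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Decorrelated jitter: draw from [base, 3 * previous], then cap.
            delay = min(max_delay, uniform(base_delay, delay * 3))
            sleep(delay)
    raise AssertionError("unreachable")
```

A test can pass sleep=recorded.append to capture the delays without waiting, then assert that the attempt count and the delay bounds match the policy.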

If you have a 4-week schedule, do six labs per week with a buffer day for the final lab and a mock-interview rehearsal. If you have an 8-week schedule, do three per week and spend the extra time on the follow-ups — that’s where senior interviews are won and lost.


Mastery Checklist

You have completed Phase 8 when you can do the following without prompting:

  • Implement LRU cache with thread-safety in <20 minutes from a blank screen, including a unit test suite that exercises eviction order.
  • Implement LFU cache with correct tie-breaking in <30 minutes.
  • Compare the four rate-limiting algorithms verbally and justify the right pick for a stated load profile in <2 minutes.
  • Implement a thread pool with bounded queue, rejection policy, and graceful shutdown in <30 minutes.
  • Implement a job queue with at-least-once semantics, and explain in <2 minutes why exactly-once delivery is impractical.
  • Implement an in-memory KV store with TTL eviction in <25 minutes.
  • Implement a circuit breaker with all three states, and explain in <2 minutes when half-open transitions back to closed.
  • Implement consistent hashing with virtual nodes in <30 minutes and explain the rebalancing cost on add/remove.
  • For any of the 23 problems, answer all 13 standard follow-ups crisply (60–90 seconds each) without notes.
  • Identify, for any production object you’ve built (in real work or in this phase), the four Golden Signals you’d emit and justify why those four.
  • State the consistency model of any data structure you’ve built in one sentence.
  • Write a stress test for a concurrent data structure that actually finds bugs (i.e., randomly interleaves operations across threads, asserts invariants after, replays the seed on failure).
  • Refactor one of your own LeetCode-style 50-line answers from any earlier phase into a clean, testable, production-shaped class without consulting any reference.
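The stress-test item in the checklist above could look roughly like the sketch below. Ledger and run_stress are hypothetical names, and the structure under test is deliberately tiny so the harness stays visible. One caveat to state in an interview: the seed reproduces each thread's operation sequence, not the operating system's thread interleaving, so a replay narrows the search for a bug but does not guarantee the same failure.

```python
import random
import threading
from typing import List


class Ledger:
    """Fixed set of accounts. Thread-safe: `_lock` guards `_balances`."""

    def __init__(self, accounts: int, opening_balance: int):
        self._lock = threading.Lock()
        self._balances = [opening_balance] * accounts

    def transfer(self, src: int, dst: int, amount: int) -> bool:
        """Move `amount` from src to dst; refuse if src would go negative."""
        with self._lock:
            if self._balances[src] < amount:
                return False
            self._balances[src] -= amount
            self._balances[dst] += amount
            return True

    def snapshot(self) -> List[int]:
        with self._lock:
            return list(self._balances)


def run_stress(seed: int, threads: int = 8, ops_per_thread: int = 5_000,
               accounts: int = 4, opening_balance: int = 100) -> List[int]:
    """Hammer a Ledger from several threads, then assert its invariants."""
    ledger = Ledger(accounts, opening_balance)
    failures: List[str] = []

    def worker(worker_seed: int) -> None:
        rng = random.Random(worker_seed)  # per-thread, reproducible op stream
        try:
            for _ in range(ops_per_thread):
                src = rng.randrange(accounts)
                dst = rng.randrange(accounts)
                ledger.transfer(src, dst, rng.randint(1, 50))
        except Exception as exc:  # record it; the main thread will assert
            failures.append(repr(exc))

    master = random.Random(seed)
    workers = [
        threading.Thread(target=worker, args=(master.getrandbits(32),))
        for _ in range(threads)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    balances = ledger.snapshot()
    # Every assertion message carries the seed so a failure can be replayed.
    assert not failures, f"seed={seed}: worker raised {failures[:3]}"
    assert sum(balances) == opening_balance * accounts, (
        f"seed={seed}: total not conserved: {balances}")
    assert min(balances) >= 0, f"seed={seed}: negative balance: {balances}"
    return balances
```

A useful sanity check on any harness like this is to remove the lock from transfer and confirm that the conservation assertion can fail. A stress test that cannot fail against a known-broken implementation is not testing the invariant.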

Exit Criteria

You may exit Phase 8 and move on to Phase 9 — Language & Runtime Deep Dive when:

  1. Lab completion: every one of the 23 labs has been implemented and tested by you, in a language you would actually use at work, with the test suite passing on the first run after a 24-hour gap (no peek-and-debug). The 24-hour gap matters — it tests retention, not short-term memory.
  2. Follow-up fluency: you can answer the 13 standard follow-ups without prompts for at least 18 of the 23 labs.
  3. Mock interview: you have done at least 2 mock interviews drawn from this phase’s problem list (Phase 11 — Mock Interview Mastery) with a passing rubric score on both, where “passing” requires hitting both the algorithmic correctness and the production-readiness rubric dimensions.
  4. Code review readiness: you can take any of your Phase 8 implementations, post it as a hypothetical PR, and write the PR description (motivation, design choices, tradeoffs, test plan) in <10 minutes per implementation.

If any of the four criteria fail, do not move on. Most candidates underestimate (3) — they pass the algorithm dimension but fail the production-readiness dimension because they didn’t rehearse the follow-ups out loud. Read COMMUNICATION.md once more, then re-do the mocks. The mocks are not optional; the practical-engineering bar is calibrated against verbalized reasoning, not solo-coded artifacts.


Cross-References

  • FRAMEWORK.md: the universal 16-step framework still applies. Practical problems extend step 16 (production implications); they do not replace steps 1–15.
  • CODE_QUALITY.md: the bar is enforced here more strictly than anywhere else in the curriculum.
  • phase-03-advanced-data-structures/: several labs (LRU, LFU, trie) build on data structures introduced there. If you skipped Phase 3, do at least labs 7 and 8 of that phase before starting here.
  • phase-04-graphs/: the consistent-hashing and dispatcher labs share modeling instincts with graph problems.
  • phase-09-language-runtime/: the next phase. Practical engineering interviews and runtime interviews are deeply intertwined; many follow-up answers in Phase 8 cite runtime facts you’ll formalize in Phase 9.
  • phase-11-mock-interviews/mock-08-staff-practical.md: built around this phase’s problem list.