Mock 08 — Staff Practical

Interview type: Staff/Principal engineer practical coding round Target role: Staff SWE (L6 Google / E6 Meta / Principal Amazon) Time limit: 75 minutes Format: Build a working component (not a LeetCode puzzle) with multiple interacting pieces Hints policy: Hints affect score but rarely fail you outright at staff bar — the bar is judgment, not raw problem-solving. Primary goal: Demonstrate the ability to build a real thing under time pressure, with monitoring/failure-mode awareness baked in.

What This Mock Tests

Staff interviews shift away from “can you solve this puzzle” toward “can you build something we’d ship.” You’re given a problem statement that resembles a small feature spec. Your job:

Decompose the problem into modules with clean interfaces
Choose data structures that match real production constraints
Implement the core fully + skeletons for the rest
Discuss monitoring, deployment, failure recovery, evolution
Justify every choice you make against alternatives

Scoring weights: Code Quality (#7), Tradeoff Reasoning (#13), Production Awareness (#14) are paramount. Pure algorithmic Optimization is less critical — staff problems rarely have a “trick.”

Pick One Build

Build A — In-Memory Rate Limiter Library

Build a usable Python/Java/Go module that provides:

limiter = RateLimiter(max_requests=100, window_seconds=60)
allowed = limiter.allow(user_id="alice")  # bool, ~10µs p99

Required:

Multiple algorithms behind a uniform interface (token bucket + sliding window log + sliding window counter)
Configurable per-user vs global limits
Thread-safe
A stats() method returning rejection rate per user
A purge() method to evict idle users from memory
Tests covering correctness, thread safety, and the window-edge case (burst at exactly t=W)

Build B — Bounded LRU Cache with TTL and Stats

Build an LRU cache that also supports per-entry TTL:

cache = LRUCache(capacity=10000, default_ttl_seconds=300)
cache.put(key, value, ttl=None)   # uses default
val = cache.get(key)              # returns None if missing/expired
cache.delete(key)
cache.stats()                     # hit rate, eviction rate, expiration rate

Required:

O(1) get/put
Lazy + active TTL expiration
Thread-safe
Memory cap (eviction policy: LRU among non-expired)
Tests: correctness, expiration races, concurrent put/get, capacity overflow

Build C — Job Scheduler (cron-like)

Build a scheduler that runs jobs at specified intervals:

scheduler = Scheduler()
scheduler.add(name="cleanup", interval_sec=300, fn=cleanup_fn)
scheduler.add(name="report", cron="0 9 * * MON", fn=report_fn)  # nice-to-have
scheduler.start()
scheduler.stop()
scheduler.status()  # last run, next run, last error per job

Required:

Multiple jobs running independently
Graceful shutdown (don’t kill mid-job)
Per-job error isolation (one job’s failure doesn’t crash the scheduler)
Catch-up policy on missed runs (skip vs catch-up; configurable)
Tests: timing, overlap, panic in job

Expected Communication Style

Restate with assumptions stated upfront: “I’ll assume single-process, multi-threaded, in-memory; if you want me to extend to distributed, that’s a separate discussion.”
Propose the module decomposition before writing any code. Whiteboard the public interface, the internal modules, the data flow.
Identify the 2–3 critical design decisions and discuss alternatives.
Pick one and code it — favor depth over breadth. Skeleton/stub the rest with comments like # TODO: implement token bucket variant with same interface.
Discuss monitoring without prompting: which metrics, why, what alerts.
Discuss failure modes without prompting: thread starvation, memory blowup, race conditions, partial failures.
Test the critical path. Production-style tests, not just smoke.

Solution Sketches

A. Rate Limiter:

class RateLimiter(ABC):
    def allow(self, key: str) -> bool: ...
    def stats(self) -> dict: ...
    def purge(self, idle_seconds: int): ...

class SlidingWindowLogLimiter(RateLimiter):
    def __init__(self, max_requests, window_seconds):
        self._max = max_requests
        self._window = window_seconds
        self._logs = defaultdict(deque)   # key → deque of timestamps
        self._lock = threading.Lock()
        self._rejects = Counter()
        self._accepts = Counter()
    
    def allow(self, key):
        now = time.monotonic()
        with self._lock:
            log = self._logs[key]
            while log and log[0] <= now - self._window:
                log.popleft()
            if len(log) < self._max:
                log.append(now)
                self._accepts[key] += 1
                return True
            self._rejects[key] += 1
            return False

Plus token bucket and sliding-window-counter implementations behind the same interface.

B. LRU + TTL: doubly linked list + hashmap; each node stores (key, value, expires_at, prev, next). get: check expiry, evict if expired, return; else move to MRU. put: insert/update, evict LRU if over capacity. Background thread (or lazy on every get/put) sweeps expired entries.

C. Job Scheduler: thread pool + priority queue of (next_run_time, job). Main loop: peek queue, sleep until next, run job in pool, re-schedule. Catch exceptions per job; record to status. Graceful shutdown: stop accepting new runs, await running ones with timeout.

Common Failure Modes

Built the algorithm without the interface. Staff interviews care about the API as much as the implementation.
No thread safety. Mentioned in the spec; missed → fail.
No mention of monitoring/observability. Critical staff signal.
Used global state. Hard to test, hard to reason about.
Coded all three rate limiter algorithms in 75 min instead of one well + sketches. Depth > breadth.
TTL implementation does periodic full scan. O(N) sweep per second isn’t acceptable; lazy + bounded active sweep is.
Scheduler: jobs share state and races corrupt it. Job functions need to be treated as untrusted code.

Passing Bar

Total score: 56/70 (average 4.0)
Code Quality #7 ≥ 4
Tradeoff Reasoning #13 ≥ 4
Production Awareness #14 ≥ 4
Working core; documented stubs for the rest
At least 3 tests covering: correctness, concurrency, edge timing
Monitoring + failure modes discussed substantively

Follow-up Questions

For A:

Make it distributed. → Redis with Lua atomic ops; or per-shard local limiter + global reconciliation.
Hot-user problem. → Sharded sub-limiters per user; or local L1 cache.
Add quota burst (allow 2× for 5 sec then throttle). → Token bucket with two-tier refill.

For B:

What’s the GC pressure under high churn? → Allocation per put is the cost; pool nodes if hot path.
Persist across restart. → Periodic snapshot to disk; replay log on startup.
Add a probabilistic admission filter (TinyLFU). → Prevent cache pollution from one-hit-wonders.

For C:

Distribute across N workers. → Leader-elected scheduler that dispatches jobs; or per-shard schedulers.
Persistent jobs (survive restart). → Persist queue to durable storage.
Jobs with dependencies. → DAG scheduler; topological execution.
Job retries with exponential backoff. → Per-job retry state machine.

Required Tests

Correctness on the basic case
Thread safety (concurrent calls; assert no double-count, no race-induced overflow)
Timing edges (window boundary, expiration boundary, scheduling drift)
Failure: what happens if the underlying clock jumps backward?
Resource cleanup: after purge / shutdown, no leaked threads or memory

Required Discussion (production)

Cover, at minimum:

Metrics you’d export (Prometheus-style)
Alert thresholds you’d set
Memory cap behavior
Failure modes and recovery
Deployment story (config, rollout, rollback)
Evolution: how would you add a new rate-limiting algorithm? (Should be drop-in.)

Self-Evaluation Template

Mock 08 — Staff Practical
Date: _______
Build: _______
Time: ___ / 75 min

Scores (1–5):
___ Total /70

Critical dimensions:
  Code Quality (#7): ___
  Tradeoff (#13): ___
  Production (#14): ___

Interface designed before coding? Y/N
Monitoring discussed unprompted? Y/N
Failure modes discussed unprompted? Y/N
Thread safety verified by test? Y/N

What I left unfinished (and what I'd do with another hour):

Action item:

What to Do If You Fail

Production score below 4: Build a real version of one of these systems and run it for a week with metrics. Phase 8 has more.
Code quality below 4: Have a senior do a written code review of your submission; act on it.
Tradeoff below 4: For every decision, force yourself to write down at least 2 alternatives and a rejection reason.
Pass twice consecutively before Mock 09.

LeetCode Interview — Extreme Coding