p16 — Validate Binary Search Tree

Source: LeetCode 98 · Medium · Topics: Tree, DFS, BST Companies (2024–2025 frequency): Amazon (very high), Microsoft (very high), Meta (high), Google (high), Bloomberg (high — phone screen staple), Apple (medium), Adobe (medium) Loop position: first round in 80% of phone screens that include a tree question; the canonical “you’ll pick the wrong solution first” trap.

1. Quick Context

This is the single most well-known booby-trap problem in tree interviews. The naive solution — “check node.left.val < node.val < node.right.val at every node” — looks correct, passes 4 of LC’s 5 sample cases, and is wrong on the case [5,1,4,null,null,3,6] (which has 3 < 4 < 6 locally but 3 < 5 globally — violating BST).

The interview value is in the candidate’s diagnostic process:

  • Do you immediately write the wrong solution? (most do)
  • Do you catch it yourself via the right example? (top-30%)
  • Can you articulate the BST invariant correctly before coding? (top-10% — this is the move that flips the round)

Two clean solutions exist:

  1. Bounds-passing (top-down): carry (lo, hi) open intervals into each recursive call; at the leaf, the constraints from ancestors are baked in.
  2. Inorder monotonicity (bottom-up): inorder of a valid BST is strictly increasing; do an iterative inorder and check each successive pair.

Both are O(N). Both are interviewer-acceptable. Knowing why bounds-passing generalizes (any monotone predicate on the path from root to leaf) and why inorder generalizes (any property reducible to ordered iteration) is the Senior-level conversation.

What it looks like it tests: can you traverse a tree. What it actually tests: can you state an invariant precisely BEFORE coding, and can you spot the difference between a local check and a global check.


🔗 https://leetcode.com/problems/validate-binary-search-tree/

STOP. 8-min timer. If you write node.left.val < node.val < node.right.val and submit — that’s the lesson. Build the test case [5,1,4,null,null,3,6] by hand and trace through your code.


3. Prerequisite Concepts

  • p12, p13 — recursive tree walks that return booleans
  • The notion of an invariant that holds for every subtree (not just immediate children)
  • Inorder traversal (recursive AND iterative — see phase-01-foundations/labs/lab-09-tree-traversal-fundamentals.md)
  • Python’s float('inf') / float('-inf') sentinels or None checks for unbounded sides

4. How to Approach (FRAMEWORK Steps 1–9)

Step 1 — Restate: “Given the root of a binary tree, return True iff it is a valid BST. A valid BST requires that for EVERY node, ALL values in its left subtree are strictly less than the node’s value AND ALL values in its right subtree are strictly greater. Strict inequality — duplicates are not allowed.”

Step 2 — Clarify:

  • “Strict or non-strict?” (Per LC: strict. State this explicitly — some org’s internal BSTs allow duplicates on one side.)
  • “Empty tree — is it a valid BST?” (Yes. Vacuously true.)
  • “Integer range?” (LC: −2^31 to 2^31 − 1. Important: this means float('-inf') is safe as a sentinel BUT in some languages you’d need a special tri-state.)
  • “Single node?” (Valid. Trivially BST.)
  • “Should I assume the tree is balanced?” (No. Worst case is depth N.)

Step 3 — Constraints: N up to 10^4. Values fit in int32. O(N) easily achievable; the goal is correctness, not asymptotic improvement.

Step 4 — Examples:

  • [2,1,3] → True.
  • [5,1,4,null,null,3,6]False (this is THE test case — say it aloud).
  • [1] → True.
  • [1,1] → False (strict; duplicate).
  • [2147483647] → True (boundary; sentinel-safe).
  • Empty → True.

Step 5 — Brute Force: At every node, scan its entire left subtree (must all be < node.val) and entire right subtree (must all be > node.val). O(N²). State this then optimize.

Step 6 — Complexity (optimal): O(N) time, O(H) stack space (recursive bounds OR iterative inorder).

Step 7 — Pattern Recognition:

  • Bounds-passing = “pass-down-constraint” recursion (Path: ancestors → descendants). Same shape recurs in segment-tree updates, interval DP, and Knapsack-bounded.
  • Inorder monotonicity = “tree-as-sorted-sequence” trick. Same idea unlocks p18, LC 99, LC 230, LC 285.

Step 8 — Two distinct optimal solutions:

  • Solution A (bounds): recursive validate(node, lo, hi) where root starts with lo=None, hi=None. Each left call tightens hi = node.val; each right call tightens lo = node.val. Check lo < node.val < hi.
  • Solution B (inorder): maintain prev (the last value seen). On each inorder visit, assert prev is None or prev < node.val, then prev = node.val. Use iterative inorder for the stack-safe production version.

Step 9 — Prove correctness (bounds version):

  • Invariant maintained: when validate(node, lo, hi) is called, every value on the path from root to node constrains node.val such that lo < node.val < hi is necessary. The recursion enforces this at every node.
  • Sufficiency: if the invariant holds at every node, then for any pair (a, b) with a in left subtree of b, a.val < b.val holds because a was visited with hi <= b.val. Symmetric for right subtree. ∎

5. Progressive Hints

If stuck for 8 min, open hints.md. Don’t peek before constructing the [5,1,4,null,null,3,6] counterexample by hand.


6. Deeper Insight — Why “local check” fails

The reason node.left.val < node.val < node.right.val is wrong is transitivity is not local. The BST property is a path-cumulative constraint: the value of any descendant on the left of an ancestor must be less than that ancestor’s value, no matter how far down.

Picture [5,1,4,null,null,3,6]:

      5
     / \
    1   4
       / \
      3   6

At node 4: locally 3 < 4 < 6 ✓. But 3 is in the right subtree of 5, so 3 > 5 must hold. It doesn’t. The local check has no visibility into ancestor 5’s constraint.

Bounds-passing fixes this by making the ancestor’s constraint travel with the recursion. Inorder fixes this by linearizing the tree to a sequence where every pair-violation is adjacent.

The deeper takeaway: whenever a tree problem asks about a property that involves two non-adjacent nodes, your two go-to tools are (a) pass info down via parameters, or (b) flatten to inorder/preorder/postorder.


7. Anti-Pattern Analysis

  1. Local check (node.left.val < node.val < node.right.val). Wrong; see above. This is the trap.
  2. Using <= instead of <. Off-by-one on duplicates. Accepts [1,1] as a valid BST. Always restate “strict” in Step 2.
  3. Returning early True without recursing into both subtrees. A common bug: if not check_left: return False followed by return True forgets the right subtree.
  4. Using float('inf') in a language where ints can equal it. In Python, fine. In Java/C++, you must use Long.MIN_VALUE / MAX_VALUE or null/Optional, because Integer.MAX_VALUE could legitimately appear as a node value and equal the sentinel. State this trade-off if asked about polyglot implementation.
  5. Inorder solution that builds a full list then checks sorted. Works but uses O(N) extra space. The streaming version with a single prev is O(H) space. Pre-empt by writing the streaming version.

8. Skills & Takeaways

  • Invariant-first thinking. Write the invariant in English before coding. Every BST problem you’ll ever see is a variant of “what does the invariant imply about this question?”
  • Two reusable templates in your toolkit: bounds-passing recursion (for any path-cumulative property) and inorder monotonicity (for any order-reducible property).
  • Strict vs non-strict equality is always worth one explicit clarification question. Cheap signal.
  • Streaming inorder (with a single prev variable) is the right production form. Knowing it cold prevents O(N) extra space in many follow-ups.

9. When to Move On

Move on when:

  • You can write the bounds-passing solution from memory in <3 minutes.
  • You can write iterative inorder from memory in <4 minutes.
  • You can explain — without code — why [5,1,4,null,null,3,6] breaks the local check and which ancestor’s constraint the local check is missing.
  • Your stress test agrees with a brute-force O(N²) reference on 1000 random trees including known-invalid cases.

10. Company Context

Amazon:

  • Loves this for SDE-II and SDE-III phone screens. Look for adversarial: “what if values can be duplicates?” (must switch to non-strict or define which side duplicates go).
  • Follow-up: “what if the tree has 10^9 nodes and doesn’t fit in memory?” → discuss streaming inorder from on-disk B-tree storage. Bar Raiser checks this.
  • Scorecard phrase: “Deep Dive — articulated invariant precisely; identified the local-check trap unprompted.”

Google:

  • Will push on complexity proof. “Prove the bounds approach is O(N) and not O(N log N)” — answer: each node visited once, constants don’t grow with depth.
  • Follow-up: “design validate(node) that also returns the count of nodes” → two-channel pattern (link to p14).
  • Scorecard phrase: “Algorithm — clean invariant, complete proof; clear about strict-vs-non-strict semantics.”

Meta:

  • Often pairs this with LC 99 (Recover BST) in the same round. If you nail p16, expect the recovery follow-up.
  • Watch for “code it iteratively” — if your first pass is recursive, they’ll ask for the iterative inorder version.
  • Scorecard phrase: “Coding Excellence — produced both recursive and iterative solutions; chose the right one for the follow-up.”

Microsoft:

  • Less adversarial; will accept either solution. Will probe: “what if I tell you the tree is BST EXCEPT for two swapped nodes?” → LC 99.
  • Scorecard phrase: “Problem Solving — recognized invariant immediately; handled boundary integer values.”

Bloomberg:

  • Phone-screen staple. Often paired with “implement BST insert” in the same 45 minutes.
  • Will ask “what’s the use case for BSTs at Bloomberg?” — answer: order books, time-series indexing, range queries.
  • Scorecard phrase: “Strong — correct first try; explained BST utility in financial systems.”

Apple:

  • Less common but appears in iOS/macOS frameworks team. Will probe: “Swift uses red-black trees in Dictionary — how does balancing relate to this?”
  • Scorecard phrase: “Solid Fundamentals — connected BST to production data structures.”

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff/Principal
Initial approachWrites local check, doesn’t noticeWrites local check, asked to find bugAsks clarifying questions, writes bounds first tryStates invariant, names both solutions, picks one with rationale
Edge casesForgets empty/singleHandles them after promptLists them upfrontLists them upfront AND argues int-overflow sentinel choice
Code qualityMutable globals, no commentsClean function, no invariant commentINVARIANT docstring, both versionsBoth versions + production note on iterative for depth safety
Follow-up handlingStuck on “what if duplicates”Switches to <= correctlyDiscusses ambiguity, asks which sideDesigns general predicate-validator; generalizes to LC 99
CommunicationSilent codingNarrates afterNarrates whileNarrates intent, complexity, and trade-offs in real time

12. Level Delta

Mid (L4 / SDE-II):

  • Gets the bounds-passing solution after one nudge.
  • Mentions inorder as alternative.
  • Handles strict-vs-non-strict correctly when asked.

Senior (L5 / SDE-III):

  • Writes bounds-passing first try with None sentinels.
  • Writes streaming iterative inorder for follow-up without prompting.
  • Articulates why local check fails using the canonical counterexample.

Staff (L6 / Principal Engineer Candidate):

  • States the invariant in one sentence before touching the keyboard.
  • Implements BOTH solutions; explains when to prefer each (bounds for early termination on adversarial path; inorder for follow-ups asking about successor/k-th smallest).
  • Discusses depth-limit production concerns (recursion → stack overflow) and offers iterative as the deployable form.

Principal (L7+):

  • Generalizes: “the bounds-passing trick is a special case of constraint propagation — same idea is in solvers, type-checkers, interval analysis.”
  • Connects to RB-trees / B-trees / LSM-trees — why “balanced” matters for the BST guarantees to hold under arbitrary insert order.
  • Discusses concurrent BST validation (snapshot isolation? COW?) for distributed systems.

13. Follow-up Q&A

Q1: What if the tree allows duplicates on the left subtree (i.e., left.val <= node.val < right.val)? A: Switch the bounds check to lo < node.val < hi becomes lo <= node.val < hi. For inorder, allow prev == node.val (non-strict monotone). Restate the invariant explicitly: “values in left subtree are ≤ node, values in right are >.”

Q2: How would you implement this iteratively without recursion? A: Iterative inorder with an explicit stack and a prev variable. Push lefts onto stack until null, pop and check against prev, then descend right. O(H) auxiliary space.

Q3: Can you do it without float('inf') sentinels? A: Yes — use None for unbounded sides and check (lo is None or lo < node.val) and (hi is None or node.val < hi). Slightly verbose but language-agnostic.

Q4: How would you modify this to also count the number of invalid nodes (nodes that violate the BST property at some ancestor)? A: Use the two-channel pattern (p14). The recursion returns the validity flag; a closure variable accumulates the count of violators. Requires defining “which node is the violator” — typically the descendant whose value breaks the constraint, not the ancestor.

Q5: A colleague suggests storing (min, max, is_bst) at each node as a return triple and computing bottom-up. Compare with bounds-passing. A: Bottom-up triple is also O(N), but it carries 3 values up instead of pushing 2 down. Pros: no need for sentinels; easier to extend to “find largest BST subtree” (LC 333). Cons: more state per call. Both are interviewer-acceptable; pick based on the follow-up (largest BST subtree → triple; validate only → bounds).


14. Solution Walkthrough

See solution.py:

  • validate_bounds(root) — recursive bounds-passing with None sentinels (the production-recommended form).
  • validate_inorder_iterative(root) — iterative inorder with prev (stack-safe; the form you’d ship).
  • validate_inorder_recursive(root) — recursive inorder with nonlocal prev (cleaner code; same algorithm).
  • validate_brute_n2(root) — O(N²) reference oracle: at every node, scan entire left/right subtree.
  • stress_test() — generates 1000 random binary trees (some valid BSTs, some invalid via swap-mutation); asserts all 4 implementations agree.
  • edge_cases() — empty, single, [5,1,4,null,null,3,6], duplicates, INT_MAX boundary.

Run: python3 solution.py → “ALL TESTS PASSED” on stdout.


15. Beyond the Problem

Principal-engineer code review (verbatim, the kind you’d see on a Phab/Gerrit diff):

“This validation logic is correct but I’d flag two things before merging. First, the recursive form will stack-overflow on adversarial input (any sorted-insertion BST of depth > sys.getrecursionlimit()). For a public API ingesting user-provided trees, ship the iterative inorder version — same O(N), same correctness, but it uses heap not native stack. Second, the use of float('-inf') is fine in Python but if this is ever ported to Java/Rust/Go, the sentinel collision with a legitimate INT_MIN node value will silently corrupt validation. Either parameterize the sentinel type or refactor to the None-based bounds — the latter ports cleanly. Also: name the failing-case unit test test_subtree_violates_ancestor_constraint not test_invalid_tree_1 so the next on-call understands what broke when it regresses.”

This is the level of code-review feedback you should be internalizing — not just “does it work” but “will it break in production, will it port, will the next engineer understand the regression.”

Connections forward:

  • p17 (LCA of BST) exploits the same “BST property as shortcut” lever.
  • p18 (Kth Smallest) uses the inorder monotonicity insight directly.
  • p19 (Sorted Array → BST) is the inverse of inorder.
  • LC 99 (Recover BST) is the canonical “what if inorder finds two anomalies” problem and is a frequent follow-up.
  • The bounds-passing pattern reappears in segment-tree update propagation, interval DP, and constraint solvers.