p97 — Text Justification

Source: LeetCode 68 · Hard · Topics: String, Greedy, Simulation Companies (2024–2025): Google (high), Airbnb (high), Amazon (medium), LinkedIn (medium) Loop position: The “implementation Hard” — algorithmically trivial (greedy line-packing) but brutal on detail. Tests precision under pressure.

1. Quick Context

Given a list of words and a target width maxWidth, format each line:

  • Pack as many words per line as fit (with single spaces between).
  • Distribute extra spaces evenly among gaps; if uneven, left-leaning slots get one extra.
  • Last line is left-justified (single spaces between words, trailing spaces to fill width).
  • A line with one word is left-justified.

Output: list of strings, each exactly maxWidth characters.

🔗 https://leetcode.com/problems/text-justification/

STOP. 35-min timer.

3. Prerequisite Concepts

  • Greedy line packing.
  • Integer division + remainder for distribution.
  • String join.

4. How to Approach (FRAMEWORK 1–9)

1. Restate: Format paragraphs with full justification, last line left-justified.

2. Clarify: Word longer than maxWidth? LC says guaranteed not. Empty words? No.

3. Examples:

words = ["This","is","an","example","of","text","justification."], maxWidth = 16
→ ["This    is    an", "example  of text", "justification.  "]

5. Brute: Brute force isn’t really meaningful here — the problem IS the simulation. Implement straightforwardly; verify with a property-based oracle that re-parses output.

6. Target: O(total chars) time.

7. Pattern: Two phases: (a) greedy pack to find each line’s word range; (b) format each line with spacing rules.

8. Develop: see solution.py.

9. Test: single short line; line with one long word; last line as middle-line edge; exact-fit (sum of words + (k-1) spaces == maxWidth); maxWidth = 1; word.length == maxWidth.

5. Progressive Hints

See hints.md.

6. Deeper Insight

6.1 Line-packing greedy

Maintain line_len = sum of word lengths in the current line + (count - 1) for mandatory single spaces. When adding word w would exceed maxWidth (line_len + 1 + len(w) > maxWidth), close the current line.

This greedy is optimal for the LC problem definition. (Note: Knuth-Plass justification — used in TeX — is a global DP optimizing badness; not what LC asks.)

6.2 Even-with-left-leaning space distribution

For a line of k words with total_word_chars total characters and gaps = k - 1 gaps:

  • Total spaces to distribute: maxWidth - total_word_chars.
  • Base spaces per gap: total_spaces // gaps.
  • Leftover: total_spaces % gaps. Add one extra to the first leftover gaps.

Edge: k == 1 → no gaps; the whole maxWidth - total_word_chars becomes trailing space.

6.3 Last line — special-cased

Last line is always left-justified: words joined by single spaces, then padded with trailing spaces to width. Same rule applies if k == 1 on any line (whether last or not — but for the last line, the rule is uniform).

6.4 Why this is “easy-Hard”

Algorithm is trivial. The Hard rating comes from edge case density: 5 special cases in 30 lines of code, each silently breakable. Interview signal is precision and discipline, not insight.

6.5 Family: text layout / line breaking

  • LC 68 Text Justification (this).
  • LC 1531 String Compression II.
  • LC 418 Sentence Screen Fitting.
  • LC 65 Valid Number (more parsing, similar precision bar).
  • Knuth-Plass (TeX) line breaking — DP minimizing sum of squared “badness”.

7. Anti-Pattern Analysis

  1. Forget last-line is left-justified — fails on the trailing-words test.
  2. Forget single-word line is left-justified — common bug.
  3. Distribute extra space evenly using // without % — last gap gets all the slack, looks wrong.
  4. Use ' ' * (maxWidth - line_len) between every word — collapses spaces incorrectly.
  5. Use ' '.join() then pad — works for last line only; not for full justification.
  6. Track line as a string and append — works but harder to reason about; track word indices instead.
  7. Off-by-one when checking fit — must account for the implicit space between current last word and the new word.

8. Skills & Takeaways

  • Two-phase design: split into “decide line boundaries” + “format one line”.
  • Distribute remainder via divmod.
  • Implementation Hard requires checklists, not insight.

9. When to Move On

  • Implement in 30 min cold with no off-by-ones.
  • Cite Knuth-Plass as the global-optimization variant.

10. Company Context

  • Google: L4/L5; classic phone screen at this rating.
  • Airbnb: loves this as a precision check.
  • Amazon: L5.

Strong: “Greedy line-packing (decide where each line ends), then format: distribute maxWidth - total_chars spaces across k - 1 gaps with divmod, left-leaning extras. Last line and single-word lines are left-justified.”

11. Interviewer’s Lens

SignalJuniorMidSeniorStaff+
PackAfter hintColdColdCold
Even distributeBuggyFix afterCold+ divmod-clean
Last lineForgetsFixesColdCold
Single-wordForgetsForgetsCatchesCatches
FamilyNoneNoneNone+ Knuth-Plass

12. Level Delta

  • Mid (L4): Works with 1-2 bug fixes in interview.
  • Senior (L5): Works first try.
  • Staff (L6): + clean two-phase decomposition + cites Knuth-Plass.
  • Principal (L7+): + recognizes as line-breaking subproblem in document layout engines (TeX, browsers’ text shapers — HarfBuzz + ICU); + production tradeoff: greedy is fast and predictable; Knuth-Plass produces better typography but is O(n²); browsers use greedy for speed.

13. Follow-up Q&A

Q1: Minimize the total “badness” across all lines (Knuth-Plass). A1: DP over word indices; dp[i] = min total badness ending line at word i; transition over previous line-start j; O(n²). Used by TeX.

Q2: Right-justify instead of full-justify. A2: Leading spaces instead of trailing; same packing.

Q3: Hyphenation across lines. A3: Pre-compute hyphenation points (TeX dictionary); add hyphen as a virtual word boundary with associated badness.

Q4: Streaming — receive words one at a time, emit lines. A4: Maintain current line buffer; emit when next word doesn’t fit.

Q5: Variable-width fonts (rendering). A5: Width function maps char → pixels; same algorithm but with float widths and a budget.

14. Solution Walkthrough

See solution.py:

  • full_justify — greedy pack + format.
  • _format_line — single-line formatter with all special cases.
  • Verifier function that re-parses output to confirm round-trip invariants.

15. Beyond the Problem

Principal code review: “Text Justification is document layout in miniature. The greedy approach used here is what browsers, IDEs, and console text renderers use because it’s O(n) and predictable. The Knuth-Plass approach (TeX, 1981) treats line breaking as global optimization: define a ‘badness’ metric per line (deviation from ideal stretch), then minimize the sum across all lines via DP. The result is dramatically better typography — TeX-set documents are visually superior to browser-rendered ones precisely because of this. The Staff+ insight is recognizing the tradeoff: greedy is O(n) and locally optimal; DP is O(n²) but globally optimal. Production systems make this choice deliberately: web browsers (Chrome, Safari) use greedy for performance; PDF generators (LaTeX, ConTeXt) use Knuth-Plass for quality; the in-between (InDesign) lets users choose. The deeper lesson: ‘how good is good enough’ is an engineering judgment, and being able to articulate the spectrum of options between O(n) greedy and O(n²) global-DP is what separates Staff+ from Senior. The same tradeoff appears in compiler instruction scheduling, query planner cost-based vs heuristic optimization, and ML hyperparameter search.”