Skip to content

Chapter 33

Correctness Before Cleverness

A technical-foundation chapter on correctness in senior coding interviews, covering invariants, loop correctness, boundary conditions, mutation hazards, integer overflow, nullability, Unicode, duplicates, and invalid input.

Part IV - Coding Interview Mastery Coding fluencyProblem framingExecutionProduction judgmentCommunication and reflection CodingPractical CodingDebuggingSenior Interview 55 min ready
Jump around the book
On this page

What interview reasoning this foundation supports

Correctness is the first senior signal in a coding interview because every optimization depends on it.

Clever code that is wrong is not close to good code. A senior candidate may use advanced patterns, but they do not hide the conditions that make those patterns safe. They state invariants, control boundary conditions, update mutable state deliberately, handle duplicates and invalid input according to the contract, and test the cases most likely to break the solution.

This foundation supports four interview behaviors:

Behavior Correctness reasoning provides
Problem framing A precise contract for what the code must and must not handle.
Algorithm design An invariant that explains why the algorithm works.
Implementation Boundary and state updates that preserve the invariant.
Testing and debugging Targeted examples that reveal broken assumptions.

The point is not to write slow code. It is to avoid optimizing a solution whose behavior is not yet trustworthy.

Core concepts

Invariants

An invariant is a statement that remains true at a specific point in the algorithm. It connects code mechanics to correctness.

Examples:

Pattern Invariant
Sliding window The current window satisfies the required condition, such as no duplicates or sum within a limit.
Binary search If the target exists, it remains inside the current search range.
Merge intervals The output contains merged, non-overlapping intervals for all intervals processed so far.
BFS shortest path Nodes are visited in nondecreasing distance from the source.
Dynamic programming Each state stores the correct answer for a defined subproblem.
Backtracking The current path represents exactly the choices made along the active recursion branch.

An invariant should be concrete enough to test. “The algorithm is correct” is not an invariant.

Boundary conditions

Most live coding bugs occur at the edges:

  • empty input;
  • one element;
  • two elements;
  • first or last index;
  • inclusive versus exclusive endpoints;
  • duplicates;
  • negative values;
  • maximum values;
  • all values equal;
  • no valid answer;
  • multiple valid answers;
  • disconnected graph;
  • cyclic graph;
  • deeply skewed tree.

Boundary conditions are not afterthoughts. They determine loop ranges, comparison operators, base cases, and return values.

Loop correctness

A loop is correct when initialization, maintenance, and termination all preserve the intended result.

For a two-pointer loop:

  • Initialization: left and right start at valid positions.
  • Maintenance: each pointer movement preserves the invariant or deliberately restores it.
  • Progress: at least one pointer moves toward termination each iteration.
  • Termination: when the loop exits, all relevant candidates have been considered.

For binary search, the loop condition and midpoint update must match the interval convention:

  • closed interval: left <= right, update right = mid - 1 or left = mid + 1;
  • half-open interval: left < right, update right = mid or left = mid + 1.

Mixing conventions causes off-by-one bugs that tests often miss unless you include one- and two-element cases.

Mutation hazards

Mutation is not bad. Uncontrolled mutation is bad.

Common hazards:

  • storing a reference to a list that later changes;
  • mutating input when the caller expects it preserved;
  • sharing visited state across independent branches;
  • reusing a default mutable value across calls;
  • removing from a collection while iterating over it;
  • updating one piece of state without updating its companion;
  • sorting input in place without permission;
  • using global state that survives between test cases.

The senior move is to name ownership:

“I will mutate the visited set as traversal state, but I will not mutate the input grid. If mutation were allowed, I could mark cells in place to save memory.”

Integer overflow and numeric correctness

Some languages have arbitrary-precision integers; others have fixed-width integer overflow. Even when overflow is not likely in the interview environment, senior candidates notice numeric limits.

Risk areas:

  • computing midpoint as (left + right) / 2 in fixed-width integer languages;
  • multiplying large dimensions or counts;
  • using int for sums that may exceed input value range;
  • using floating point for exact comparisons;
  • relying on negative modulo behavior without checking language semantics;
  • choosing sentinel values such as -1 or Integer.MAX_VALUE when those values may be valid input.

Safer midpoint:

mid = left + (right - left) / 2

Safer sentinel strategy: track presence explicitly when the domain can contain the sentinel.

Nullability and missing values

Absence must be represented deliberately.

Distinguish:

  • null input versus empty input;
  • missing map key versus key with zero value;
  • empty string versus absent string;
  • optional field versus required field;
  • leaf child absent versus child node with value zero;
  • no answer versus answer at index 0.

A common bug is treating truthiness as presence. If 0, false, or "" can be valid, use an explicit containment or null check.

Unicode and string representation

String prompts often pretend characters are simple. Production strings are not always simple.

Possible units:

  • bytes;
  • code units;
  • Unicode code points;
  • grapheme clusters visible to users.

For many interview problems, the interviewer expects ordinary character iteration. Senior candidates still ask when it matters:

“Should I treat characters as simple code points for this problem, or do user-visible grapheme clusters matter?”

Do not derail a routine prompt with a Unicode lecture. Do mention representation when the prompt involves user-visible text, length limits, normalization, case folding, or international input.

Duplicates

Duplicates change data structure choice and correctness.

Examples:

  • A set loses multiplicity; use a count map when counts matter.
  • Two-sum with duplicate values needs distinct indexes.
  • Removing duplicates from sorted data requires careful write-index handling.
  • Top-k frequency needs tie-breaking rules.
  • Graph nodes may have duplicate values but distinct identities.

Ask whether duplicates are possible if the prompt does not say. If they are possible, include at least one duplicate test.

Invalid input

Interview prompts vary. Some guarantee valid input; others expect validation.

Clarify:

  • Can input be null?
  • Can intervals have start greater than end?
  • Can graph edges reference missing nodes?
  • Can strings contain invalid encoding?
  • Should the function raise, return an error value, or ignore invalid records?

If the prompt guarantees validity, do not overbuild validation. If the prompt is practical coding or production-oriented, validation may be a core requirement.

Decision rules and trade-offs

State the contract before optimizing

Before improving complexity, know what behavior must be preserved:

  • Is any valid answer acceptable, or must it be stable and deterministic?
  • Is output order significant?
  • Are inputs mutable?
  • Are duplicate values distinct records?
  • Are invalid inputs possible?
  • What should happen when there is no solution?

Optimization that changes the contract is a regression, not an improvement.

Prefer a simple correct baseline when the invariant is unclear

If you cannot state why the clever approach works, use a simpler approach to establish behavior.

Example: For “k closest points”, sorting all points by distance is O(n log n), simple, deterministic, and easy to test. A heap or quickselect can improve asymptotics, but each adds correctness risks: comparator direction, tie handling, expected versus worst-case behavior, and mutation of the input. If constraints allow sorting, it may be the right interview choice.

Make illegal states hard to represent

Choose data structures that encode the rule:

  • count map instead of set when multiplicity matters;
  • enum or explicit status instead of magic strings;
  • tuple (distance, node) when priority and payload must stay together;
  • half-open interval convention [start, end) consistently across code;
  • separate found boolean instead of sentinel when sentinel can collide with real data.

The representation should reduce the number of correctness checks you must remember manually.

Many bugs appear when state variables describe the same concept but are updated separately.

Examples:

  • moving a sliding-window pointer without removing counts;
  • popping a stack without updating a min-stack companion;
  • changing parent pointers without changing child references;
  • marking a graph node visited after enqueue instead of before, causing duplicates;
  • updating best answer before restoring the invariant.

When two variables form one logical state, keep their updates adjacent and narrate the invariant they preserve.

Copy only at ownership boundaries

Backtracking illustrates the rule:

  • The active path can be mutated with push/pop.
  • A completed answer must be copied before storing.

Copy too little and stored results mutate accidentally. Copy too much and complexity explodes. The correct copy point is where ownership changes.

Common failure modes and traps

  • Using < where <= is required, or the reverse.
  • Moving a pointer backward in a sliding window.
  • Infinite loops from pointer updates that do not guarantee progress.
  • Treating empty input as an error when the contract says it is valid.
  • Returning a sentinel that can be confused with a valid answer.
  • Forgetting recursion base cases for null, empty, or leaf nodes.
  • Mutating a collection while iterating over it.
  • Reusing mutable state across tests or recursion branches.
  • Assuming graph node values are unique when identity is by node reference.
  • Losing duplicates by using a set where counts matter.
  • Overflowing sums, products, or midpoint calculations in fixed-width languages.
  • Treating byte length as user-visible string length.
  • Sorting input in place without asking whether mutation is allowed.

Red flags interviewers may record:

  • “The candidate optimized before establishing correctness.”
  • “The candidate could not explain the invariant.”
  • “The candidate fixed bugs by changing conditions randomly.”
  • “The solution passed the sample but failed straightforward edge cases.”
  • “The implementation relied on assumptions the candidate never stated.”

Interview applications

Clarification

Correctness questions should be specific:

  • “Can the input be empty?”
  • “Can values repeat?”
  • “Should I return any valid answer or all answers?”
  • “Does output order matter?”
  • “May I mutate the input?”
  • “Are intervals closed or half-open?”
  • “Are string characters simple code points for this problem?”
  • “What should happen for invalid input?”

Each question should affect the code.

Implementation

Before coding the main loop, say the invariant. During coding, align names and updates with it.

Example:

“The window is valid after this while loop finishes. I will update best only after restoring validity.”

That one sentence prevents a common bug: recording an answer from an invalid state.

Testing

Correctness tests should attack assumptions:

Risk Test
Empty input [], "", or null if allowed.
Boundary index One item and two items.
Inclusive/exclusive confusion Intervals that touch exactly.
Duplicate handling Repeated values that require counts or distinct indexes.
Mutation aliasing Store a path, then continue mutating the active path.
Overflow Large values near type limits in fixed-width languages.
Nullability Missing child, missing map key, absent optional field.
Unicode Non-ASCII or composed characters when user-visible text matters.
Invalid input Bad interval, malformed record, unknown node, if contract requires validation.

Worked examples

Example 1: Binary search first true

Prompt:

“Given a monotonic boolean function over indexes 0..n-1, return the first index where it is true, or -1 if none exists.”

Correct invariant:

“The first true index, if it exists, is always in [left, right], and answer stores the best true index seen so far.”

Implementation shape:

left = 0
right = n - 1
answer = -1

while left <= right:
    mid = left + (right - left) / 2
    if is_true(mid):
        answer = mid
        right = mid - 1
    else:
        left = mid + 1

return answer

Tests:

Pattern Expected
[false, false, true, true] 2
[true, true] 0
[false, false] -1
[true] 0
[false] -1

Correctness risk: using left < right with closed-interval updates can skip the last candidate.

Example 2: Backtracking combinations

Prompt:

“Return all combinations of k numbers chosen from 1..n.”

Invariant:

“The current path is strictly increasing and contains only choices made for the active recursion branch.”

Mutation rule:

  • push before recursing;
  • pop after recursing;
  • copy the path when storing an answer.

Buggy behavior:

answers.append(path)

If path is mutable, every stored answer may later become the same list. Correct behavior stores a copy at the ownership boundary.

Pruning rule:

If the remaining numbers cannot fill the path to length k, stop that branch. This improves practical performance while preserving correctness because no valid completion exists.

Example 3: Duplicate-aware two sum

Prompt:

“Return indexes of two numbers that sum to target.”

Correctness details:

  • The same array element cannot be used twice.
  • Duplicate values may be necessary, such as [3, 3] with target 6.
  • Index 0 is a valid answer index, so truthiness checks are unsafe.

Senior approach:

“As I scan, I will look for the complement in a map of previously seen values to indexes. I check before inserting the current value so I cannot reuse the same element.”

Tests:

Input Target Expected reason
[2,7,11,15] 9 Normal case.
[3,3] 6 Duplicate values, distinct indexes.
[0,4,3,0] 0 Index 0 and zero value are valid.
[1,2,3] 7 No answer behavior must match contract.

The algorithm is simple; the correctness depends on ordering the lookup and insert correctly.

Practice prompts

Correctness drills

  • For ten common patterns, write the invariant before writing code.
  • Implement binary search using both closed and half-open interval conventions; test one- and two-element cases.
  • Write a sliding-window solution and identify exactly when the window becomes valid.
  • Implement backtracking that stores copied answers and verify stored results do not mutate.
  • For three map-based solutions, test missing key, zero value, and duplicate values.
  • For a string prompt, state the character representation assumption and add one non-ASCII test when relevant.
  • For a graph prompt, test disconnected, cyclic, and duplicate-value nodes.
  • Revisit an old solution and list every implicit assumption about nullability, mutation, and invalid input.

Self-check rubric

Score Evidence
1 You rely on samples and intuition rather than explicit correctness reasoning.
2 You handle common happy paths but miss boundary, duplicate, or mutation bugs.
3 You can state invariants for familiar patterns and test obvious edge cases.
4 You consistently align contract, invariant, loop updates, and tests under time pressure.
5 You can explain correctness trade-offs, recover from failed tests methodically, and identify production data hazards without overbuilding.

Reference card and field reference

Field reference

Correctness before cleverness

  • State the contract before optimizing.
  • Name the invariant before coding the core loop or recursion.
  • Check initialization, maintenance, progress, and termination.
  • Test empty, one-item, two-item, duplicate, boundary, no-answer, and maximum cases.
  • Treat mutation as an ownership question; copy at ownership boundaries.
  • Avoid sentinels that can collide with valid input.
  • Include recursion base cases and stack-depth awareness.
  • Use explicit presence checks when 0, false, or empty strings are valid values.
  • Ask about Unicode representation when user-visible string behavior matters.
  • Clarify invalid input handling; do not overbuild validation when input is guaranteed valid.