Chapter 33
Correctness Before Cleverness
A technical-foundation chapter on correctness in senior coding interviews, covering invariants, loop correctness, boundary conditions, mutation hazards, integer overflow, nullability, Unicode, duplicates, and invalid input.
Jump around the book
On this page
What interview reasoning this foundation supports
Correctness is the first senior signal in a coding interview because every optimization depends on it.
Clever code that is wrong is not close to good code. A senior candidate may use advanced patterns, but they do not hide the conditions that make those patterns safe. They state invariants, control boundary conditions, update mutable state deliberately, handle duplicates and invalid input according to the contract, and test the cases most likely to break the solution.
This foundation supports four interview behaviors:
| Behavior | Correctness reasoning provides |
|---|---|
| Problem framing | A precise contract for what the code must and must not handle. |
| Algorithm design | An invariant that explains why the algorithm works. |
| Implementation | Boundary and state updates that preserve the invariant. |
| Testing and debugging | Targeted examples that reveal broken assumptions. |
The point is not to write slow code. It is to avoid optimizing a solution whose behavior is not yet trustworthy.
Core concepts
Invariants
An invariant is a statement that remains true at a specific point in the algorithm. It connects code mechanics to correctness.
Examples:
| Pattern | Invariant |
|---|---|
| Sliding window | The current window satisfies the required condition, such as no duplicates or sum within a limit. |
| Binary search | If the target exists, it remains inside the current search range. |
| Merge intervals | The output contains merged, non-overlapping intervals for all intervals processed so far. |
| BFS shortest path | Nodes are visited in nondecreasing distance from the source. |
| Dynamic programming | Each state stores the correct answer for a defined subproblem. |
| Backtracking | The current path represents exactly the choices made along the active recursion branch. |
An invariant should be concrete enough to test. “The algorithm is correct” is not an invariant.
Boundary conditions
Most live coding bugs occur at the edges:
- empty input;
- one element;
- two elements;
- first or last index;
- inclusive versus exclusive endpoints;
- duplicates;
- negative values;
- maximum values;
- all values equal;
- no valid answer;
- multiple valid answers;
- disconnected graph;
- cyclic graph;
- deeply skewed tree.
Boundary conditions are not afterthoughts. They determine loop ranges, comparison operators, base cases, and return values.
Loop correctness
A loop is correct when initialization, maintenance, and termination all preserve the intended result.
For a two-pointer loop:
- Initialization: left and right start at valid positions.
- Maintenance: each pointer movement preserves the invariant or deliberately restores it.
- Progress: at least one pointer moves toward termination each iteration.
- Termination: when the loop exits, all relevant candidates have been considered.
For binary search, the loop condition and midpoint update must match the interval convention:
- closed interval:
left <= right, updateright = mid - 1orleft = mid + 1; - half-open interval:
left < right, updateright = midorleft = mid + 1.
Mixing conventions causes off-by-one bugs that tests often miss unless you include one- and two-element cases.
Mutation hazards
Mutation is not bad. Uncontrolled mutation is bad.
Common hazards:
- storing a reference to a list that later changes;
- mutating input when the caller expects it preserved;
- sharing visited state across independent branches;
- reusing a default mutable value across calls;
- removing from a collection while iterating over it;
- updating one piece of state without updating its companion;
- sorting input in place without permission;
- using global state that survives between test cases.
The senior move is to name ownership:
“I will mutate the
visitedset as traversal state, but I will not mutate the input grid. If mutation were allowed, I could mark cells in place to save memory.”
Integer overflow and numeric correctness
Some languages have arbitrary-precision integers; others have fixed-width integer overflow. Even when overflow is not likely in the interview environment, senior candidates notice numeric limits.
Risk areas:
- computing midpoint as
(left + right) / 2in fixed-width integer languages; - multiplying large dimensions or counts;
- using
intfor sums that may exceed input value range; - using floating point for exact comparisons;
- relying on negative modulo behavior without checking language semantics;
- choosing sentinel values such as
-1orInteger.MAX_VALUEwhen those values may be valid input.
Safer midpoint:
mid = left + (right - left) / 2
Safer sentinel strategy: track presence explicitly when the domain can contain the sentinel.
Nullability and missing values
Absence must be represented deliberately.
Distinguish:
- null input versus empty input;
- missing map key versus key with zero value;
- empty string versus absent string;
- optional field versus required field;
- leaf child absent versus child node with value zero;
- no answer versus answer at index 0.
A common bug is treating truthiness as presence. If 0, false, or "" can be valid, use an explicit containment or null check.
Unicode and string representation
String prompts often pretend characters are simple. Production strings are not always simple.
Possible units:
- bytes;
- code units;
- Unicode code points;
- grapheme clusters visible to users.
For many interview problems, the interviewer expects ordinary character iteration. Senior candidates still ask when it matters:
“Should I treat characters as simple code points for this problem, or do user-visible grapheme clusters matter?”
Do not derail a routine prompt with a Unicode lecture. Do mention representation when the prompt involves user-visible text, length limits, normalization, case folding, or international input.
Duplicates
Duplicates change data structure choice and correctness.
Examples:
- A set loses multiplicity; use a count map when counts matter.
- Two-sum with duplicate values needs distinct indexes.
- Removing duplicates from sorted data requires careful write-index handling.
- Top-k frequency needs tie-breaking rules.
- Graph nodes may have duplicate values but distinct identities.
Ask whether duplicates are possible if the prompt does not say. If they are possible, include at least one duplicate test.
Invalid input
Interview prompts vary. Some guarantee valid input; others expect validation.
Clarify:
- Can input be null?
- Can intervals have start greater than end?
- Can graph edges reference missing nodes?
- Can strings contain invalid encoding?
- Should the function raise, return an error value, or ignore invalid records?
If the prompt guarantees validity, do not overbuild validation. If the prompt is practical coding or production-oriented, validation may be a core requirement.
Decision rules and trade-offs
State the contract before optimizing
Before improving complexity, know what behavior must be preserved:
- Is any valid answer acceptable, or must it be stable and deterministic?
- Is output order significant?
- Are inputs mutable?
- Are duplicate values distinct records?
- Are invalid inputs possible?
- What should happen when there is no solution?
Optimization that changes the contract is a regression, not an improvement.
Prefer a simple correct baseline when the invariant is unclear
If you cannot state why the clever approach works, use a simpler approach to establish behavior.
Example: For “k closest points”, sorting all points by distance is O(n log n), simple, deterministic, and easy to test. A heap or quickselect can improve asymptotics, but each adds correctness risks: comparator direction, tie handling, expected versus worst-case behavior, and mutation of the input. If constraints allow sorting, it may be the right interview choice.
Make illegal states hard to represent
Choose data structures that encode the rule:
- count map instead of set when multiplicity matters;
- enum or explicit status instead of magic strings;
- tuple
(distance, node)when priority and payload must stay together; - half-open interval convention
[start, end)consistently across code; - separate
foundboolean instead of sentinel when sentinel can collide with real data.
The representation should reduce the number of correctness checks you must remember manually.
Update related state together
Many bugs appear when state variables describe the same concept but are updated separately.
Examples:
- moving a sliding-window pointer without removing counts;
- popping a stack without updating a min-stack companion;
- changing parent pointers without changing child references;
- marking a graph node visited after enqueue instead of before, causing duplicates;
- updating best answer before restoring the invariant.
When two variables form one logical state, keep their updates adjacent and narrate the invariant they preserve.
Copy only at ownership boundaries
Backtracking illustrates the rule:
- The active path can be mutated with push/pop.
- A completed answer must be copied before storing.
Copy too little and stored results mutate accidentally. Copy too much and complexity explodes. The correct copy point is where ownership changes.
Common failure modes and traps
- Using
<where<=is required, or the reverse. - Moving a pointer backward in a sliding window.
- Infinite loops from pointer updates that do not guarantee progress.
- Treating empty input as an error when the contract says it is valid.
- Returning a sentinel that can be confused with a valid answer.
- Forgetting recursion base cases for null, empty, or leaf nodes.
- Mutating a collection while iterating over it.
- Reusing mutable state across tests or recursion branches.
- Assuming graph node values are unique when identity is by node reference.
- Losing duplicates by using a set where counts matter.
- Overflowing sums, products, or midpoint calculations in fixed-width languages.
- Treating byte length as user-visible string length.
- Sorting input in place without asking whether mutation is allowed.
Red flags interviewers may record:
- “The candidate optimized before establishing correctness.”
- “The candidate could not explain the invariant.”
- “The candidate fixed bugs by changing conditions randomly.”
- “The solution passed the sample but failed straightforward edge cases.”
- “The implementation relied on assumptions the candidate never stated.”
Interview applications
Clarification
Correctness questions should be specific:
- “Can the input be empty?”
- “Can values repeat?”
- “Should I return any valid answer or all answers?”
- “Does output order matter?”
- “May I mutate the input?”
- “Are intervals closed or half-open?”
- “Are string characters simple code points for this problem?”
- “What should happen for invalid input?”
Each question should affect the code.
Implementation
Before coding the main loop, say the invariant. During coding, align names and updates with it.
Example:
“The window is valid after this
whileloop finishes. I will updatebestonly after restoring validity.”
That one sentence prevents a common bug: recording an answer from an invalid state.
Testing
Correctness tests should attack assumptions:
| Risk | Test |
|---|---|
| Empty input | [], "", or null if allowed. |
| Boundary index | One item and two items. |
| Inclusive/exclusive confusion | Intervals that touch exactly. |
| Duplicate handling | Repeated values that require counts or distinct indexes. |
| Mutation aliasing | Store a path, then continue mutating the active path. |
| Overflow | Large values near type limits in fixed-width languages. |
| Nullability | Missing child, missing map key, absent optional field. |
| Unicode | Non-ASCII or composed characters when user-visible text matters. |
| Invalid input | Bad interval, malformed record, unknown node, if contract requires validation. |
Worked examples
Example 1: Binary search first true
Prompt:
“Given a monotonic boolean function over indexes 0..n-1, return the first index where it is true, or -1 if none exists.”
Correct invariant:
“The first true index, if it exists, is always in
[left, right], andanswerstores the best true index seen so far.”
Implementation shape:
left = 0
right = n - 1
answer = -1
while left <= right:
mid = left + (right - left) / 2
if is_true(mid):
answer = mid
right = mid - 1
else:
left = mid + 1
return answer
Tests:
| Pattern | Expected |
|---|---|
[false, false, true, true] |
2 |
[true, true] |
0 |
[false, false] |
-1 |
[true] |
0 |
[false] |
-1 |
Correctness risk: using left < right with closed-interval updates can skip the last candidate.
Example 2: Backtracking combinations
Prompt:
“Return all combinations of k numbers chosen from 1..n.”
Invariant:
“The current path is strictly increasing and contains only choices made for the active recursion branch.”
Mutation rule:
- push before recursing;
- pop after recursing;
- copy the path when storing an answer.
Buggy behavior:
answers.append(path)
If path is mutable, every stored answer may later become the same list. Correct behavior stores a copy at the ownership boundary.
Pruning rule:
If the remaining numbers cannot fill the path to length k, stop that branch. This improves practical performance while preserving correctness because no valid completion exists.
Example 3: Duplicate-aware two sum
Prompt:
“Return indexes of two numbers that sum to target.”
Correctness details:
- The same array element cannot be used twice.
- Duplicate values may be necessary, such as
[3, 3]with target 6. - Index 0 is a valid answer index, so truthiness checks are unsafe.
Senior approach:
“As I scan, I will look for the complement in a map of previously seen values to indexes. I check before inserting the current value so I cannot reuse the same element.”
Tests:
| Input | Target | Expected reason |
|---|---|---|
[2,7,11,15] |
9 | Normal case. |
[3,3] |
6 | Duplicate values, distinct indexes. |
[0,4,3,0] |
0 | Index 0 and zero value are valid. |
[1,2,3] |
7 | No answer behavior must match contract. |
The algorithm is simple; the correctness depends on ordering the lookup and insert correctly.
Practice prompts
Correctness drills
- For ten common patterns, write the invariant before writing code.
- Implement binary search using both closed and half-open interval conventions; test one- and two-element cases.
- Write a sliding-window solution and identify exactly when the window becomes valid.
- Implement backtracking that stores copied answers and verify stored results do not mutate.
- For three map-based solutions, test missing key, zero value, and duplicate values.
- For a string prompt, state the character representation assumption and add one non-ASCII test when relevant.
- For a graph prompt, test disconnected, cyclic, and duplicate-value nodes.
- Revisit an old solution and list every implicit assumption about nullability, mutation, and invalid input.
Self-check rubric
| Score | Evidence |
|---|---|
| 1 | You rely on samples and intuition rather than explicit correctness reasoning. |
| 2 | You handle common happy paths but miss boundary, duplicate, or mutation bugs. |
| 3 | You can state invariants for familiar patterns and test obvious edge cases. |
| 4 | You consistently align contract, invariant, loop updates, and tests under time pressure. |
| 5 | You can explain correctness trade-offs, recover from failed tests methodically, and identify production data hazards without overbuilding. |
Reference card and field reference
Field reference
Correctness before cleverness
- State the contract before optimizing.
- Name the invariant before coding the core loop or recursion.
- Check initialization, maintenance, progress, and termination.
- Test empty, one-item, two-item, duplicate, boundary, no-answer, and maximum cases.
- Treat mutation as an ownership question; copy at ownership boundaries.
- Avoid sentinels that can collide with valid input.
- Include recursion base cases and stack-depth awareness.
- Use explicit presence checks when
0,false, or empty strings are valid values. - Ask about Unicode representation when user-visible string behavior matters.
- Clarify invalid input handling; do not overbuild validation when input is guaranteed valid.
Related reading
Continue reading
Full table of contents