/review -- Two-pass code review
First pass: does it work? Second pass: what could break it?
Pass 1: Structural correctness
The first pass checks whether the code does what the plan says. It reads plan.json and walks through every step, verifying the implementation matches the intent.
What it checks:
- Every planned file exists and contains the expected logic.
- Types are correct and consistent across boundaries.
- Imports resolve. No circular dependencies introduced.
- Error handling exists where the plan specified it.
- No dead code or unreachable branches.
When the agent finds an issue it can fix without changing behavior, it applies an auto-fix and marks the finding as AUTO-FIX. Missing null checks, unused imports, incorrect types with obvious corrections. These do not need human approval.
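The auto-fix policy can be sketched as a small classifier. This is a minimal illustration, assuming a hypothetical `Finding` shape; the field names are not the tool's actual schema.

```typescript
// Hypothetical finding shape -- field names are illustrative,
// not the tool's actual internal schema.
interface Finding {
  rule: string;
  behaviorPreserving: boolean; // true if the fix cannot change runtime behavior
  fix?: () => void;            // applied only for AUTO-FIX findings
}

type Action = "AUTO-FIX" | "ASK";

// Pass 1 policy: behavior-preserving fixes apply immediately,
// everything else waits for a human.
function classify(finding: Finding): Action {
  return finding.behaviorPreserving && finding.fix ? "AUTO-FIX" : "ASK";
}

// Apply auto-fixes in place; return the findings that still need a human.
function applyFindings(findings: Finding[]): Finding[] {
  const deferred: Finding[] = [];
  for (const f of findings) {
    if (classify(f) === "AUTO-FIX") f.fix!();
    else deferred.push(f);
  }
  return deferred;
}
```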
Pass 2: Adversarial review
The second pass tries to break the code. The agent switches perspective: instead of checking whether the code works, it looks for ways the code can fail.
- Edge cases -- Empty arrays, null values, zero-length strings, maximum integers, Unicode in unexpected places.
- Race conditions -- Concurrent access to shared state, async operations that assume ordering, missing locks.
- Security surface -- User input reaching SQL, HTML, or shell commands. Missing authentication checks. Overly broad permissions.
- Resource leaks -- Open file handles, database connections not returned to pools, event listeners not cleaned up.
- Failure modes -- What happens when the network is down, the disk is full, the database returns an error, the API rate-limits you.
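One way to picture the edge-case category: throw a canned set of adversarial inputs at a function and check that it neither throws nor violates its invariant. The `slugify` handler and the input list below are illustrative examples, not project code.

```typescript
// Canned adversarial string inputs, mirroring the edge cases above.
const adversarialStrings = [
  "",                  // empty string
  "\u0000",            // embedded null
  "a".repeat(10_000),  // very long input
  "💥 ünïcode",        // Unicode in unexpected places
  " \t\n ",            // whitespace only
];

// A handler under review -- hypothetical example code.
function slugify(title: string): string {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-|-$/g, "");
}

// Probe: does the handler survive every input without throwing,
// and does its output uphold the invariant (a safe slug)?
function probe(fn: (s: string) => string, inputs: string[]): string[] {
  const failures: string[] = [];
  for (const input of inputs) {
    try {
      const out = fn(input);
      if (!/^[a-z0-9-]*$/.test(out)) failures.push(JSON.stringify(input));
    } catch {
      failures.push(JSON.stringify(input));
    }
  }
  return failures;
}
```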
Scope drift detection
The review compares every modified file against plan.json. Files that were not in planned_files get flagged.
{
"finding": "scope_drift",
"file": "src/utils/format.ts",
"severity": "warning",
"detail": "File modified but not listed in plan.json",
"action": "ASK"
}

The agent will not auto-fix scope drift. It asks you to either justify the change or revert it.
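The comparison itself is a set difference. A minimal sketch, assuming `plan.json` carries a `planned_files` array as in the example above:

```typescript
// Hypothetical plan shape: planned_files mirrors the plan.json field above.
interface Plan {
  planned_files: string[];
}

interface ScopeDriftFinding {
  finding: "scope_drift";
  file: string;
  severity: "warning";
  detail: string;
  action: "ASK";
}

// Flag every modified file that the plan never mentioned.
function detectScopeDrift(plan: Plan, modifiedFiles: string[]): ScopeDriftFinding[] {
  const planned = new Set(plan.planned_files);
  return modifiedFiles
    .filter((file) => !planned.has(file))
    .map((file) => ({
      finding: "scope_drift",
      file,
      severity: "warning",
      detail: "File modified but not listed in plan.json",
      action: "ASK", // never auto-fixed: justify the change or revert it
    }));
}
```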
Conflict detection with /security
Some review findings contradict security findings. The classic case: /review wants detailed error messages for debugging, /security wants generic messages to avoid information disclosure.
When the review runs, it pre-scans for likely security conflicts and tags them. The /security phase reads these tags and produces a unified recommendation instead of contradicting the review.
{
"finding": "error_detail_conflict",
"review_says": "Add stack trace to error response for debugging",
"security_says": "Remove stack trace, exposes internals (OWASP A01)",
"resolution": "Stack trace in dev mode only, generic message in production"
}

Finding classification
Every finding gets two labels: an action and a severity.
Actions
- AUTO-FIX -- The agent can fix this without changing behavior. Applied immediately.
- ASK -- Requires human judgment. The agent explains the tradeoff and waits.
Severities
- critical -- Data loss, security hole, crashes. Blocks the sprint.
- high -- Incorrect behavior under normal conditions. Blocks the sprint.
- medium -- Incorrect behavior under edge conditions. Does not block but gets reported.
- low -- Style, naming, minor improvements. Noted in the journal.
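The routing above can be sketched as a small gate. The type and function names are illustrative:

```typescript
type Severity = "critical" | "high" | "medium" | "low";

// critical and high block the sprint; medium is reported; low goes to the journal.
const BLOCKING: ReadonlySet<Severity> = new Set(["critical", "high"]);

function blocksSprint(severities: Severity[]): boolean {
  return severities.some((s) => BLOCKING.has(s));
}
```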
Intensity modes
Control how deep the review goes.
- --quick -- Pass 1 only. Structural correctness, no adversarial review. Good for small changes you are confident about.
- --standard -- Both passes. Default mode.
- --thorough -- Both passes plus cross-file analysis. Checks how your changes interact with existing code paths that were not modified. Slower but catches integration bugs.
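The three flags map onto the passes like this; a sketch of the mapping, with assumed names:

```typescript
type Mode = "quick" | "standard" | "thorough";

// Which analyses each intensity flag enables.
function passesFor(mode: Mode) {
  return {
    structural: true,              // pass 1 always runs
    adversarial: mode !== "quick", // pass 2 skipped under --quick
    crossFile: mode === "thorough" // extra integration analysis
  };
}
```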