# **Adversarial Review**: For All, By All

Published: 2026-06-09T09:00:00.000-04:00
Tags: llm, agents, claude-code, open-source, code-review
Canonical: https://www.voodootikigod.com/adversarial-review

> adversarial-review brings skeptical, model-agnostic code review to your harness as the /adversarial-review skill, and to CI as a zero-dependency CLI. Either way, an independent LLM closes the loop the creator agent cannot close for itself.

---

Nobody stops the engineer who wrote the feature from reviewing their own pull request. There is no rule against it. They know the code better than anyone, and they will catch the obvious things. What they will not catch is the decision they already made and stopped questioning: the edge case they told themselves was acceptable, the error path they noted and moved past, the assumption so deep in the implementation it stopped looking like an assumption. The second set of eyes was never about catching what they missed by accident. It is about bringing a reader who does not remember why.

The default agentic workflow skips that step entirely. The agent writes the change, reads back its own diff, and approves it. It already decided every tradeoff in that diff, and re-reading them just confirms the decisions it made minutes ago. The bug it waved through is still sitting there. It has no way to see it.

The [OpenAI Codex CLI harness plugin](https://github.com/openai/codex-plugin-cc/tree/main?ref=voodootikigod.com) fixed this with one move: take the diff the creator agent just produced, hand it to a second model with a clean context window, and ask that model to break confidence in the change. The creator carries every decision it already made. The reviewer carries none of them. The quality delta is real and immediate. In the Agentic Development Lifecycle, this is what has become known as an Adversarial Review.

But the initial implementation only runs inside the Codex runtime. Hit your Codex subscription limit and the review stops mid-loop, the change sits unreviewed until your quota resets. Want the same check on every branch and pull request? There is no path; CI cannot reach into the Codex runtime to run it. The insight is right. The implementation is welded to one runtime.

So I pulled it out of the Codex runtime and made it standalone: `adversarial-review`.

## A different model in a clean context is the whole trick

The quality gain from adversarial review is not mysterious. The creator agent carries the context of everything it decided: the approach it chose, the tradeoffs it made, the edge cases it told itself were acceptable. When you ask that same agent to review its own work, it does not re-evaluate those decisions. It confirms them. It remembers why it made them.

A separate model with a fresh context window has none of that. It sees what is there, not what was intended. It catches the unhandled error path the creator glossed over because the happy path was already in mind. It notices the missing import. It asks whether this actually does what it claims.

Same reason you do not let the engineer who built the feature write the tests and call it QA.

`adversarial-review` is built directly on the design the Codex plugin established. What it adds is the ability to run that same review against any model, any runtime, any harness.

## The loop closes on the reviewer, not the creator

Here is what the workflow looks like with `adversarial-review` wired in:

```mermaid
graph TD
    A[Creator/Editor Agent writes or edits code] --> B[adversarial-review gathers git diff]
    B --> C[Alternative LLM in clean context reviews diff]
    C --> D{Verdict?}
    D -- needs-attention --> E[Editor Agent receives feedback and fixes code]
    E --> B
    D -- approve --> F[Safe to push / merge / ship]
```

The editor agent makes changes. The CLI collects the diff. A different model, different provider, different session, and clean context reads only that diff and has one job: break confidence in the change. It evaluates accuracy, completeness, and adherence to engineering practice. It returns structured JSON: a binary verdict (`approve` or `needs-attention`), and for each finding, the file, line range, severity, and a specific recommendation.

The verdict maps directly to the shell exit code. Zero for `approve`, two for `needs-attention`. That is what makes it drop-in for a pre-commit hook or a CI step. The editor agent gets the findings, fixes the code, and the loop runs again until the reviewer approves.

Since I put this in place, I have shipped fewer bugs. Not zero. But fewer.

## The skill runs the loop.

The entry point is the skill. The repository ships an [Agent Skill](https://agentskills.io/home?ref=voodootikigod.com) ready for you to use in your agentic workflow by dropping it into Claude Code, Codex, Antigravity, or any compliant harness via [Skills.sh](https://skills.sh?ref=voodootikigod.com). You ask "is this branch safe to ship?" or invoke `/adversarial-review` directly, and the agent runs the loop for you, gathers the diff, calls the reviewer, returns the verdict, and if findings come back it feeds them to the editor and loops until the reviewer approves. The whole thing stays inside your normal working session. No separate terminal. No manual invocation.

Getting it there is one command. [Skills.sh](https://skills.sh?ref=voodootikigod.com) reads the skill straight from the GitHub repository and installs it into Claude Code, Codex, Cursor, or any compliant harness:

```bash
npx skills add voodootikigod/adversarial-review
```

From then on it is available in every session. Ask `is this branch safe to ship?` in plain language, or call `/adversarial-review` and scope it the way you would the CLI when you need to:

```bash
# Review the current branch against main, with a focus area
/adversarial-review --base main "focus on the auth boundary"
```

Some harnesses cannot run anything: no npx, no scripts, just a model reading text. The skill still works there. It ships the prompt template and JSON schema as plain reference files, so the agent assembles the review and validates its own output by reasoning alone. No binary, no network, no environment access. That is the [Agent Skills](https://agentskills.io/specification?ref=voodootikigod.com) format working as intended.

For CI pipelines, pre-commit hooks, and anywhere you need the review scripted and non-interactive, `npx` is the path:

```bash
# Review uncommitted working-tree changes
npx adversarial-review

# Review the current branch against develop
npx adversarial-review --base develop

# Direct the reviewer to focus on a specific concern
npx adversarial-review "focus on token expiration and refresh boundaries"

# Print the compiled prompt without calling the LLM
npx adversarial-review --prompt-only > prompt.txt
```

The exit code does the work in automation. Zero for `approve`, two for `needs-attention`, drop it in a workflow step and let the pipeline gate on it.

## One agent implements. A different agent reviews. That split is the whole point.

The single-model setup, where one agent writes and reviews its own work, is fast, convenient, and produces something that looks right. Plausible is the wrong bar when you are shipping automated changes to a production codebase.

After several months of building with these tools, the pattern became consistent. One agent implements. A different agent reviews implementation quality. A third reviews visual quality, which is [its own distinct problem](https://www.voodootikigod.com/gemini-plugin-cc?ref=voodootikigod.com). Each brings different strengths and, critically, none of them carries the assumptions of the one that wrote the code.

That structure did not come from theory. It came from a specific kind of miss I kept seeing. The creator model makes a judgment call somewhere in the implementation, decides an edge case is acceptable or a shortcut is fine for now, and then reviews its own decision as correct, because it already made that decision and remembered why. It is not re-evaluating the call. It is defending it. Whereas in an Adversarial Review, the second model has no decision to protect. It reads the same code, sees the gap the creator reasoned its way past, and flags it on the first pass, almost every time. We are dealing with non-determinism here after all.

`adversarial-review` is the implementation-quality piece of that loop, made model-agnostic so it survives runtime changes, harness changes, and provider changes. The review discipline is what matters. The specific model doing the reviewing should not be a dependency.

## Nothing to audit, nothing a malicious patch can compromise

`adversarial-review` is also pure ES module Node.js, Node 18 and up, using native `fetch` and built-in CLI helpers. No external npm dependencies. Nothing to audit, nothing that can go stale, nothing a malicious patch release can compromise.

There are practical reasons for this beyond ideology. If the tool you are using to verify code quality itself pulls in a graph of transitive dependencies, you have introduced a new attack surface at the most sensitive point in your workflow. The review step is exactly where you want the least moving parts. And running it with `npx` on demand means you always get the current published version without committing a lockfile for a tool that is not part of your build.

The package is published at [npmjs.com/package/adversarial-review](https://www.npmjs.com/package/adversarial-review?ref=voodootikigod.com). `npx adversarial-review` runs immediately. No global install. No version drift. No installation step you forget to do on a new machine.

## The prompt adapts to the change, not to a fixed template

The CLI collects the git context for your repository and makes a structural decision before it formats the prompt. If the change is small, below configurable thresholds for file count and total bytes, it embeds the full diff as primary evidence. The reviewer sees exactly what changed.

For larger changes, it generates a structured summary instead. Agentic models with shell or tool access can then inspect the target files themselves rather than trying to reason about a diff that would exceed their useful context. The shape of the prompt adapts to the size of the change, not to a fixed template.

It formats the prompt, calls the configured model, parses the response using a trial-parsing approach that handles the messy ways different models return structured output, validates against a strict JSON schema, and writes the report. The whole thing is designed to fail noisily if something goes wrong, not silently return a false approve.

## First match wins

You do not need to configure anything to get started. The CLI detects your active environment in order: `ANTHROPIC_API_KEY` first, then `GEMINI_API_KEY`, then `OPENAI_API_KEY`, and finally a local subscription CLI on your `$PATH` (`claude`, `codex`, or `gemini`). First match wins.

If you want to override detection, `--provider` and `--model` are there. You can point it at Claude Sonnet for review even when you are working with Codex as your editor. You can point it at a local Gemini session when you have no API keys at all. Whatever you point it at, read the prompt templates at [voodootikigod/adversarial-review](https://github.com/voodootikigod/adversarial-review?ref=voodootikigod.com) first. Know what you are sending.

That is what model-agnostic means in practice. Not "supports multiple models in a configuration file." Actually falls back through a sensible detection chain and uses whatever is available.

Decoupling it from the Codex runtime is what makes that possible. The second read is no longer the privilege of one subscription or one harness: any model, any pipeline, any branch can run it now. That is what it means to make adversarial review available for all, by all.

* * *

_`adversarial-review` is Apache-2.0 licensed, available as a skill via `npx skills add voodootikigod/adversarial-review` and as a standalone CLI via `npx adversarial-review`. The [Agent Skills Specification](https://agentskills.io/?ref=voodootikigod.com) lives at agentskills.io._
