alice
Quality gate plugin for Claude Code. Blocks Claude from stopping (via Claude Code's Stop hook) until work passes review by an independent agent. The independent agent uses consensus review when possible (asking Codex and/or Gemini for second opinions).
Be aware:
- It's kind of like a "super ultra extra thinking" mode for Claude Code, with some interesting and useful properties.
- If you're familiar, it makes Claude Code feel like Codex with high/xhigh reasoning, but gives you the coding capabilities and speed of Claude Code.
- Can intentionally be used to run Claude Code on a task for many hours without intervention.
- If you're tight on tokens, I would not recommend it unless you're on some variant of the Max plan. The reviews are extensive and exhaustive, and the token usage is consequently large.
- I'm thinking about how to optimize this via prompting, but that work hasn't been done yet.
- In particular, when Codex is used in consensus review on high or xhigh reasoning, a review can take several minutes -- but it is extremely thorough. Keep this in mind.
For best results, I would mix in Codex and/or Gemini -- just install the CLIs and authenticate them, and alice will pick them up automatically. Mixing multiple agents into the review process seems to really improve the steering.
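If you want to check up front which second-opinion CLIs are visible on your PATH, a lookup like the following is enough. This is only a sketch: the executable names `codex` and `gemini` are assumptions, it does not verify that the CLIs are authenticated, and it is not necessarily how alice performs its own detection.

```python
import shutil

# Assumed executable names for the Codex and Gemini CLIs; adjust if yours differ.
CANDIDATE_REVIEWERS = ("codex", "gemini")


def available_reviewers(candidates=CANDIDATE_REVIEWERS, which=shutil.which):
    """Return the candidate CLIs that resolve to an executable on PATH."""
    return [name for name in candidates if which(name) is not None]


found = available_reviewers()
print("second-opinion reviewers:", ", ".join(found) or "none found")
```

The `which` parameter exists only so the lookup can be swapped out in tests.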
What this plugin doesn't solve: what you desire or how you communicate it. Be clear about what you want before you turn it on.
Install
curl -fsSL https://evil-mind-evil-sword.github.io/releases/alice/install.sh | sh
This installs:
- jwz - Agent messaging
- tissue - Issue tracking
- jq - JSON parsing (if needed)
- The alice plugin (registered with Claude Code)
The other two binaries (jwz and tissue) are small Zig programs that let Claude Code store issues and messages and retain state (all in JSONL + SQLite, like beads). alice uses them to track the state required to enforce the reviewer pattern, and they also give Claude Code a place to store issues, research notes, etc. The plugin assumes these binaries are available and contains explicit instructions for how the agent should use them. The installer's goal is to make getting started easy: you shouldn't have to think about them.
Usage
#alice <your prompt>
alice uses Claude Code's UserPromptSubmit hook to inspect your prompt and check whether you've invoked #alice. If so, it uses jwz to set a session message, which enables the Stop hook.
Review is opt-in per-prompt. After alice approves, the gate resets automatically.
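The trigger check can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not alice's actual code: the exact matching rule, the `prompt` and `session_id` field names in the hook payload, and the `sessions` dict (a hypothetical in-memory stand-in for the state alice keeps via jwz) are all illustrative.

```python
import re

# Assumed trigger rule: "#alice" as its own token at the very start of the prompt.
TRIGGER = re.compile(r"^\s*#alice(?:\s+|$)")


def parse_prompt(prompt: str) -> tuple:
    """Return (review_enabled, task); task is the prompt with the trigger removed."""
    match = TRIGGER.match(prompt)
    if match is None:
        return False, prompt
    return True, prompt[match.end():].strip()


def on_prompt(event: dict, sessions: dict) -> None:
    """Mark the session as under review when the prompt opts in.

    `sessions` is a hypothetical stand-in for the session message set via jwz.
    """
    enabled, _task = parse_prompt(event.get("prompt", ""))
    if enabled:
        sessions[event["session_id"]] = {"review": True, "verdict": None}


sessions = {}
on_prompt({"session_id": "s1", "prompt": "#alice fix the flaky test"}, sessions)
on_prompt({"session_id": "s2", "prompt": "what does #alice do?"}, sessions)
print(sessions)  # only s1 opted in; s2 merely mentions #alice mid-prompt
```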
Motivation
LLMs struggle to reliably evaluate their own outputs (Huang et al., 2023). A model asked to verify its work tends to confirm rather than critique. This creates a gap in agentic coding workflows—agents can exit believing they've completed a task when issues remain.
Research on multi-agent debate suggests a path forward: models produce more accurate outputs when they critique each other (Du et al., 2023; Liang et al., 2023).
alice applies this idea: rather than prompting agents to review themselves, it blocks exit until an independent reviewer (alice, a subagent) explicitly approves.
How It Works
Agent works → tries to exit → Stop hook → alice reviewed? → block/allow
- #alice at the start of the prompt enables review (using session state stored via jwz)
- Stop hook runs on every agent "stop" attempt (when Claude Code stops and waits for you)
- If review enabled but no approval: blocks exit, agent must spawn alice
- alice (adversarial reviewer) examines the work
- Creates tissue issues for problems found
- Posts decision: COMPLETE allows exit, ISSUES keeps agent working
- The loop repeats until alice is satisfied that the main agent has completed the task in your prompt. alice is directed to ignore the main agent's attempts to convince it that the task has been addressed, and to take an independent perspective instead.
- Otherwise, Claude Code operates normally.
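The decision made on each stop attempt can be sketched as a small function. This is an illustration, not alice's implementation: the `state` dict is a hypothetical stand-in for what alice reads from jwz, the reason strings are made up, and the `{"decision": "block", "reason": ...}` shape is my understanding of what a Claude Code Stop hook emits to keep the agent working, so check it against the hooks documentation before relying on it.

```python
import json


def stop_decision(state: dict):
    """Decide what the Stop hook should do for one session.

    `state` is hypothetical: {"review": bool, "verdict": "COMPLETE" | "ISSUES" | None}.
    Returns None to allow the stop, or a dict asking the agent to keep working.
    """
    if not state.get("review"):
        return None  # review not enabled: Claude Code operates normally
    verdict = state.get("verdict")
    if verdict == "COMPLETE":
        # Approval allows exit and resets the gate for the next prompt.
        state["review"] = False
        state["verdict"] = None
        return None
    if verdict == "ISSUES":
        reason = "alice found issues; resolve the open tissue issues and request review again."
    else:
        reason = "Review is enabled but alice has not approved; spawn the alice reviewer."
    return {"decision": "block", "reason": reason}


for state in (
    {"review": False, "verdict": None},
    {"review": True, "verdict": None},
    {"review": True, "verdict": "ISSUES"},
    {"review": True, "verdict": "COMPLETE"},
):
    print(json.dumps(stop_decision(state)))
```

Keeping the decision a pure function of the stored state makes the gate easy to test without a running agent.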
Architecture