Help us improve
Share bugs, ideas, or general feedback.
From craft-workspace-webconsulting-skills
Runs a structured code review using Codex, Claude, or other engines as a closeout check before commit or ship.
npx claudepluginhub dirnbauer/webconsulting-skillsHow this skill is triggered — by the user, by Claude, or both
Slash command
/craft-workspace-webconsulting-skills:autoreviewThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Run the bundled structured review helper as a closeout check. This is code review, not Guardian `auto_review` approval routing.
Cross-model code reviews using Codex CLI to catch bias-blind bugs. Claude writes code, Codex reviews — different architectures, no self-approval.
Orchestrates multi-agent code review with Codex CLI, Gemini CLI, and five Claude specialist subagents (security, performance, logic, regression, robustness) then synthesizes findings into verified fixes. Use for deep reviews, second opinions, or council reviews on PRs, commits, or branches.
Conducts tiered code reviews for security (OWASP top 10), performance, and quality on staged git changes, PRs, or files before commits or releases.
Share bugs, ideas, or general feedback.
Run the bundled structured review helper as a closeout check. This is code review, not Guardian auto_review approval routing.
Codex review is the default when no engine is set. It uses gpt-5.5 by default, usually delivers the best review results, and should remain the normal final closeout engine. Claude review is optional and uses claude-fable-5 by default.
Use when:
review still running: ... elapsed=... pid=... as healthy progress, not a hang. Let the helper continue while heartbeats are advancing. Pass --stream-engine-output when live engine text is useful; Codex and Claude filter tool/file chatter, other engines pass raw output through.clawsweeper[bot] or another automation, identify the human trigger when practical. Check timeline/comments first; if rate-limited, use gitcrawl/cache or public PR HTML. Look for maintainer commands such as @clawsweeper automerge, /landpr, or labels/status comments that armed automerge. Report automerge triggered by @login; if not found, say trigger unknown.codex review, nested reviewers, or reviewer panels from inside the review. The helper builds one bundle, calls one selected engine, validates one structured result, and stops.gh/Gitcrawl reports database disk image is malformed, run gitcrawl doctor --json once to let the portable cache repair before retrying review; do not bypass the shim unless repair fails and freshness requires live GitHub.gitcrawl doctor --json and inspect source_db_health, runtime_db_health, and portable_store_status before falling back to live GitHub.Set the skill script paths once, then use "$AUTOREVIEW" and "$AUTOREVIEW_HARNESS" in the examples below.
Choose one:
# Project-local skill in the current repo:
export AUTOREVIEW=".agents/skills/autoreview/scripts/autoreview"
export AUTOREVIEW_HARNESS=".agents/skills/autoreview/scripts/test-review-harness"
# Source checkout of openclaw/agent-skills:
export AUTOREVIEW="skills/autoreview/scripts/autoreview"
export AUTOREVIEW_HARNESS="skills/autoreview/scripts/test-review-harness"
# Global skill:
export AGENTS_HOME="${AGENTS_HOME:-$HOME/.agents}"
export AUTOREVIEW="$AGENTS_HOME/skills/autoreview/scripts/autoreview"
export AUTOREVIEW_HARNESS="$AGENTS_HOME/skills/autoreview/scripts/test-review-harness"
When using Claude Code, set AGENTS_HOME="$HOME/.claude" for global skills. Project-local skills live under .claude/skills/ in the current repo.
Dirty local work:
"$AUTOREVIEW" --mode local
Use this only when the patch is actually unstaged/staged/untracked in the
current checkout. --mode uncommitted is accepted as an alias for --mode local.
For committed, pushed, or PR work, point the helper at the commit
or branch diff instead; do not force dirty modes just
because the helper docs mention dirty work first. A clean local review
only proves there is no local patch.
Branch/PR work:
"$AUTOREVIEW" --mode branch --base origin/main
Optional review context is first-class:
"$AUTOREVIEW" --mode branch --base origin/main --prompt-file /tmp/review-notes.md --dataset /tmp/evidence.json
If an open PR exists, use its actual base:
base=$(gh pr view --json baseRefName --jq .baseRefName)
"$AUTOREVIEW" --mode branch --base "origin/$base"
Committed single change:
"$AUTOREVIEW" --mode commit --commit HEAD
Use commit review for already-landed or already-pushed work on main. Reviewing
clean main against origin/main is usually an empty diff after push. For a
small stack, review each commit explicitly or review the branch before merging
with --base.
Format first if formatting can change line locations. Then it is OK to run tests and review in parallel:
"$AUTOREVIEW" --parallel-tests "<focused test command>"
On Windows, the default --parallel-tests shell preserves the platform cmd.exe
semantics used by Python shell=True. Use --parallel-tests-shell powershell
or --parallel-tests-shell pwsh when the focused test command is PowerShell-specific.
Tradeoff: tests may force code changes that stale the review. If tests or review lead to code edits, rerun the affected tests and rerun review until no accepted/actionable findings remain. Once that rerun exits cleanly, stop; do not spend another long review cycle on redundant confirmation.
Run multiple reviewers against one frozen bundle:
"$AUTOREVIEW" --reviewers codex,claude,pi,droid
--panel is shorthand for Codex plus Claude unless --engine changes the first reviewer:
"$AUTOREVIEW" --panel
Set reviewer models and thinking/effort explicitly:
"$AUTOREVIEW" --reviewers codex,claude --model codex=gpt-5.5 --thinking codex=high --model claude=claude-fable-5 --thinking claude=max
Inline syntax is also supported for simple model IDs:
"$AUTOREVIEW" --reviewers codex:gpt-5.5:high,claude:claude-fable-5:max
For models with slashes or extra colons, prefer keyed form:
"$AUTOREVIEW" --engine pi --model anthropic/claude-sonnet-4 --thinking high
"$AUTOREVIEW" --engine opencode --model opencode/north-mini-code-free --thinking high
"$AUTOREVIEW" --engine droid --model claude-opus-4-8 --thinking low
"$AUTOREVIEW" --reviewers codex,pi --model codex=gpt-5.5 --model pi=anthropic/claude-sonnet-4
"$AUTOREVIEW" --reviewers codex,opencode --model codex=gpt-5.5 --model opencode=opencode/north-mini-code-free
"$AUTOREVIEW" --reviewers codex,droid --model codex=gpt-5.5 --model droid=claude-opus-4-8
The helper accepts --model globally or per engine (engine=model) and --thinking globally or per engine (engine=level). Repeat either flag for multiple reviewers.
Recommended model defaults:
| Engine | Default model | Source note |
|---|---|---|
| codex (default) | gpt-5.5 | OpenAI's current GPT-5.5 alias |
| claude | claude-fable-5 | Anthropic's most capable widely released Claude model |
CLI flags and environment variables override these defaults. Droid, Copilot, Pi, and OpenCode do not get built-in model defaults here because their provider catalogs are external to the Codex/Claude closeout path and may vary by installation.
| Engine | Model flag | Example model IDs | Thinking flag | Accepted levels |
|---|---|---|---|---|
| codex (default) | codex --model X exec ... | gpt-5.5, gpt-5.5-2026-04-23 | -c model_reasoning_effort=Y | none, minimal, low, medium, high, xhigh |
| claude | claude --model X | claude-fable-5, claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5 | --effort Y | low, medium, high, xhigh, max |
| droid | droid exec --model X | claude-opus-4-8, Factory model IDs | -r, --reasoning-effort Y | off, none, low, medium, high |
| copilot | copilot --model X | gpt-5.2, Copilot model aliases | not supported | n/a |
| pi | pi --model X | anthropic/claude-sonnet-4, openai/gpt-4o | --thinking Y | off, minimal, low, medium, high, xhigh |
| opencode | opencode run -m X | opencode/north-mini-code-free, OpenCode provider/model IDs | --variant Y | minimal, low, medium, high, max |
Claude also supports --fallback-model a,b for availability-based fallback chains (model-config). Current Claude docs note that auth, billing, rate-limit, request-size, and transport errors do not trigger fallback, and the changelog documents interactive-session support in v2.1.166.
Examples matching current main behavior:
# Codex with explicit model and reasoning
"$AUTOREVIEW" --engine codex --model gpt-5.5 --thinking high
# Claude Code aliases or full model names, with optional availability fallback
"$AUTOREVIEW" --engine claude --model claude-fable-5 --thinking max
"$AUTOREVIEW" --engine claude --model claude-fable-5 --fallback-model claude-opus-4-8,claude-sonnet-4-6
# Factory Droid with explicit model and reasoning effort
"$AUTOREVIEW" --engine droid --model claude-opus-4-8 --thinking low
# GitHub Copilot (model only; no thinking knob)
"$AUTOREVIEW" --engine copilot --model gpt-5.2
# Pi with explicit model and thinking level
"$AUTOREVIEW" --engine pi --model anthropic/claude-sonnet-4 --thinking high --pi-bin pi
# OpenCode with explicit provider/model and variant
"$AUTOREVIEW" --engine opencode --model opencode/north-mini-code-free --thinking high
CLI flags take precedence over environment variables.
| Variable | Purpose |
|---|---|
AUTOREVIEW_MODEL | Override the built-in default --model for all engines |
AUTOREVIEW_THINKING | Default --thinking for all engines |
AUTOREVIEW_FALLBACK_MODEL | Default Claude --fallback-model chain |
AUTOREVIEW_<ENGINE>_MODEL | Per-engine model override, for example AUTOREVIEW_CODEX_MODEL=gpt-5.5 |
AUTOREVIEW_<ENGINE>_THINKING | Per-engine thinking override |
AUTOREVIEW_CLAUDE_FALLBACK_MODEL | Claude-only fallback chain |
Codex maps thinking to model_reasoning_effort. Claude maps thinking to --effort. Droid maps thinking to -r, --reasoning-effort. Pi maps thinking to --thinking. OpenCode maps thinking to --variant. Copilot rejects --thinking. Only Claude accepts --fallback-model; global CLI/env fallback requires at least one Claude reviewer, and engine-specific fallback overrides require that reviewer to be selected. Non-Claude fallback overrides, including AUTOREVIEW_<NONCLAUDE>_FALLBACK_MODEL, fail closed instead of being silently ignored.
When autoreview runs inside the repository under review, external reviewer CLIs must not load project-local trust or configuration that the branch controls.
| Engine | Isolation flags | Reference |
|---|---|---|
| codex | -c project_doc_max_bytes=0, repo trust_level="untrusted", exec --ignore-user-config --ignore-rules, plus read-only sandbox | Codex CLI exec --help |
| claude | --safe-mode --setting-sources user --strict-mcp-config --disallowedTools mcp__* plus explicit --allowedTools (--safe-mode requires Claude Code v2.1.169+) | Claude Code CLI reference |
| pi | --no-approve --no-session --no-context-files --no-extensions --no-skills --no-prompt-templates --no-themes, plus read-only tool allowlist | Pi CLI --help; requires Pi v0.79.0+ |
| opencode | opencode run --dir <repo> --pure --format json, prompt over stdin, neutral subprocess cwd, injected deny-by-default permissions, project config disabled | OpenCode CLI --help |
Codex --ignore-user-config skips config loading for the exec run while keeping CODEX_HOME auth usable. The explicit repo trust override and zero project-doc budget keep reviewed-repo AGENTS.md and .codex/ trust surfaces out of the review prompt. --ignore-rules skips user/project execpolicy rules. Claude --safe-mode disables project hooks, skills, plugins, MCP servers, and CLAUDE.md while preserving normal authentication, model selection, built-in tools, and permissions; managed settings policy can still apply. --setting-sources user avoids project/local settings from the reviewed checkout, and current Claude Code docs note the project-skill blocking behavior was fixed in v2.1.69. --strict-mcp-config and --disallowedTools mcp__* keep MCP unavailable to the review run. --bare is not used here because Claude's headless docs say it skips OAuth and keychain reads. Pi --no-approve ignores project-local files for one run; the helper requires Pi v0.79.0+ plus help output that advertises every required isolation flag because older legacy binaries can ignore unknown flags. The current package is @earendil-works/pi-coding-agent; deprecated @mariozechner/pi-coding-agent 0.73.x is intentionally rejected. Pi version/help probes and the review command run from neutral temporary directories, not the reviewed repo. Pi --no-context-files removes AGENTS.md/CLAUDE.md, the resource-disable flags keep .pi extensions, skills, prompts, and themes out of the run, --no-session avoids writing review sessions, and the read-only allowlist omits bash, edit, and write. OpenCode starts from a neutral temporary directory, points at the reviewed repo with --dir, disables project config through OPENCODE_DISABLE_PROJECT_CONFIG=1, and injects OPENCODE_CONFIG_CONTENT; permissions default to deny, allow read/grep/glob, preserve OpenCode's .env ask rules, and gate websearch/webfetch with --no-web-search. The injected config also clears command/instruction/plugin arrays and disables write/edit/bash/task/skill/todowrite tools without changing user auth storage. The helper sends the review prompt over stdin rather than argv and extracts the final structured JSON from type: "text" events. OpenCode rejects --no-tools.
Run the helper directly so target selection, engine choice, structured validation, and exit status all stay in one path. If output is noisy, summarize the completed helper output after it returns; do not ask another agent or reviewer to rerun the review.
After setting AUTOREVIEW and AUTOREVIEW_HARNESS above:
"$AUTOREVIEW" --help
The smoke harness has thin shell wrappers over a shared Python implementation:
"$AUTOREVIEW_HARNESS" --fixture benign --engine codex
On native Windows, invoke the extensionless Python helper through Python:
python skills\autoreview\scripts\autoreview --help
and the smoke harness:
skills\autoreview\scripts\test-review-harness.ps1 -Fixture benign -Engine codex
The helper:
--mode uncommitted as an alias for --mode localgh pr view worksorigin/main for non-main branches--engine codex, claude, droid, copilot, pi, and opencode; default is AUTOREVIEW_ENGINE or codex; Codex should remain the default when nothing is setgit, gh, reviewer, and PowerShell shell commands from absolute PATH entries only, never from the reviewed checkout; explicit relative --*-bin paths are resolved from the reviewed repository root--mode commit --commit <ref> for already-committed work, especially clean main after landing--mode auto or forced to --mode branch for PR/branch work; do not force --mode local after committing--output, --json-output, or live streamed engine stderr is set--dry-run, --parallel-tests, --parallel-tests-shell, --prompt, --prompt-file, --dataset, --no-tools, --no-web-search, and commit refs--stream-engine-output or AUTOREVIEW_STREAM_ENGINE_OUTPUT=1 for live engine text while preserving structured validation; Codex and Claude hide tool/file event details, emit compact activity summaries, and report usage at turn completion--panel / --reviewers, plus per-engine --model, --thinking, and Claude --fallback-modelcodex=gpt-5.5 and claude=claude-fable-5; honors AUTOREVIEW_MODEL, AUTOREVIEW_THINKING, AUTOREVIEW_FALLBACK_MODEL, and per-engine AUTOREVIEW_<ENGINE>_MODEL / AUTOREVIEW_<ENGINE>_THINKING environment overrides when CLI flags are omittedcodex exec with read-only sandbox, reviewed-repo instruction/config/rule isolation flags, and structured output--safe-mode (v2.1.169+), --setting-sources user, MCP disabled, explicit allowed tools, and --fallback-model when set, so reviewed-repo hooks/skills/MCP do not affect the review run while normal auth still works; managed settings policy can still applydroid exec in read-only mode, forwards --model and -r, --reasoning-effort, and switches --output-format to stream-json when streaming is enabledv0.79.0+ from neutral temporary directories with --no-approve, --no-session, disabled Pi context/resource loading, and built-in read-only tools (read,grep,find,ls) when tools are enabledopencode run --dir <repo> --pure --format json from a neutral temporary directory, forwards --model and --variant, injects deny-by-default permissions, disables project config loading, and passes the review prompt over stdinreview still running: <engine> elapsed=<seconds>s pid=<pid> to stderr at long-running intervals while waiting for the selected review engine, unless streamed output or compact Codex activity has been visible recentlyautoreview clean: no accepted/actionable findings reported when the selected review command exits 0Include:
Do not run another review solely to improve the final report wording. If the final helper run exited 0 and produced no accepted/actionable findings, report that exact run as clean.
This skill is based on the excellent work by Peter Steinberger.
Original repository: https://github.com/openclaw/agent-skills
Copyright (c) 2026 openclaw - Auto Review structured review workflow and helper scripts (MIT License)
Special thanks to Peter Steinberger for their generous open-source contributions, which helped shape this skill collection. Adapted by webconsulting.at for this skill collection