```shell
npx claudepluginhub boshu2/agentops --plugin agentops
```

This skill uses the workspace's default tool permissions.
> **Purpose:** Is this code ready to ship?
Bundled files: references/deep-audit-protocol.md, references/deep-checks.md, references/examples.md, references/go-patterns.md, references/go-standards.md, references/json-standards.md, references/markdown-standards.md, references/patterns.md, references/python-standards.md, references/report-format.md, references/rust-standards.md, references/shell-standards.md, references/test-pyramid-weighting.md, references/typescript-standards.md, references/verification-report.md, references/vibe-coding.md, references/vibe-suppressions.md, references/write-time-quality.md, references/yaml-standards.md, scripts/prescan.sh
Quick start:

```shell
/vibe                                     # validates recent changes
/vibe recent                              # same as above
/vibe src/auth/                           # validates specific path
/vibe --quick recent                      # fast inline check, no agent spawning
/vibe --structured recent                 # 6-phase verification report (build→types→lint→tests→security→diff)
/vibe --deep recent                       # 3 judges instead of 2
/vibe --sweep recent                      # deep audit: per-file explorers + council
/vibe --mixed recent                      # cross-vendor (Claude + Codex)
/vibe --preset=security-audit src/auth/   # security-focused review
/vibe --explorers=2 recent                # judges with explorer sub-agents
/vibe --debate recent                     # two-round adversarial review
/vibe --tier=quality recent               # use quality tier for council calls
```
Before reviewing, pull relevant learnings from prior code reviews and known patterns:
```shell
if command -v ao &>/dev/null; then
  ao lookup --query "<target-scope> code review patterns" --limit 3 2>/dev/null || true
fi
```
Apply retrieved knowledge (mandatory when results returned):
If learnings or patterns are returned, do NOT just load them as passive context. For each returned item:
- Surface it as a known_risk in your review — state the pattern, what to look for, and whether the code exhibits it.

After applying, record the citation:
```shell
ao metrics cite "<learning-path>" --type applied 2>/dev/null || true
```
Skip silently if ao is unavailable or returns no results.
Project reviewer config: If .agents/reviewer-config.md exists, its full config (reviewers, plan_reviewers, skip_reviewers) is passed to council for judge selection. See skills/council/SKILL.md Step 1b.
Before scanning for changed files via git diff, check if a crank checkpoint exists:
```shell
if [ -f .agents/vibe-context/latest-crank-wave.json ]; then
  echo "Crank checkpoint found — using files_changed from checkpoint"
  FILES_CHANGED=$(jq -r '.files_changed[]' .agents/vibe-context/latest-crank-wave.json 2>/dev/null)
  WAVE_COUNT=$(jq -r '.wave' .agents/vibe-context/latest-crank-wave.json 2>/dev/null)
  echo "Wave $WAVE_COUNT checkpoint: $(echo "$FILES_CHANGED" | wc -l | tr -d ' ') files changed"
fi
```
When a crank checkpoint is available, use its files_changed list instead of re-detecting via git diff. This ensures vibe validates exactly the files that crank modified.
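The checkpoint-first fallback can be sketched as a small helper. This is a sketch, not the skill's actual implementation; `detect_files` is a hypothetical name, and the `HEAD~3` window mirrors the auto-detect step below.

```shell
# Prefer the crank checkpoint's files_changed list when present;
# otherwise fall back to re-detecting changed files via git diff.
detect_files() {
  local checkpoint=.agents/vibe-context/latest-crank-wave.json
  if [ -f "$checkpoint" ]; then
    jq -r '.files_changed[]' "$checkpoint" 2>/dev/null
  else
    git diff --name-only HEAD~3 2>/dev/null | head -20
  fi
}

FILES_CHANGED=$(detect_files)
```

Keeping the detection behind one function makes the "checkpoint wins over git diff" rule a single decision point rather than scattered conditionals.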
If target provided: Use it directly.
If no target or "recent": Auto-detect from git:
```shell
# Check recent commits
git diff --name-only HEAD~3 2>/dev/null | head -20
```
If nothing found, ask user.
Pre-flight: If no files are found, return immediately with: "PASS (no changes to review) — no modified files detected." Do NOT spawn agents for empty file lists.
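As a minimal sketch, the pre-flight guard amounts to an empty-string check before any agent is spawned (variable names here are illustrative):

```shell
# Pre-flight guard: bail out before spawning any agents
# when the detection step produced no files.
FILES_CHANGED=""   # illustrative: nothing was detected
if [ -z "$FILES_CHANGED" ]; then
  VERDICT="PASS (no changes to review) — no modified files detected."
  # return immediately; do NOT spawn agents
fi
echo "$VERDICT"
```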
If --structured flag is set, run a 6-phase mechanical verification pipeline instead of the council flow. This produces a machine-readable verification report suitable for PR gates and CI integration.
Phases: Build → Types → Lint → Tests → Security → Diff Review.
Read references/verification-report.md for the full report template and per-phase commands. Each phase is fail-fast — if Build fails, skip remaining phases and report NOT READY.
After all phases complete, write the structured report to .agents/council/YYYY-MM-DD-verification-<target>.md and output the summary table to the user.
When to use: Pre-PR gate, CI integration, when you need a mechanical pass/fail rather than judgment-based review.
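The fail-fast ordering can be sketched as a simple loop. `run_phase` is a stand-in for the per-phase commands documented in references/verification-report.md; here it just simulates a lint failure so the short-circuit is visible.

```shell
# Fail-fast pipeline sketch for --structured mode.
# run_phase is a hypothetical helper; this stub pretends 'lint' fails.
run_phase() { [ "$1" != "lint" ]; }

VERDICT=READY
for phase in build types lint tests security diff-review; do
  if ! run_phase "$phase"; then
    VERDICT="NOT READY ($phase failed)"
    break   # skip all remaining phases
  fi
done
echo "$VERDICT"
```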
If the --quick flag is set, skip the heavy pre-processing steps (2.5 and 2a–2f) and jump to Step 4 with an inline council after running Steps 2.3, 2.4, 2g, and Step 3. Domain checklists, compiled-prevention loading, the test-pyramid inventory, and inline product context are cheap and high-value, so they still run in quick mode. Complexity analysis (Step 2) also still runs — it's cheap and informative.
Why: Steps 2.5 and 2a–2f add 30–90 seconds of pre-processing that mainly feeds multi-judge council packets. In --quick mode (single inline agent), those inputs are not worth the cost, but the test-pyramid and product-context checks still shape the inline review meaningfully.
Filter by language present in the change set first. Run only the analyzers whose language actually appears in the diff. A docs/shell/BATS-only epic must NOT trigger gocyclo against the entire cli/ tree (it has hung in past runs); a Python-free epic must NOT trigger radon.
```shell
# Detect which languages are present in the diff (or in <path> for full audits).
# Use `git diff --name-only <base>...HEAD` for a PR; fall back to listing
# files under <path> when no diff base is available.
mkdir -p .agents/council
HAS_GO=false; HAS_PY=false
DIFF_FILES="$(git diff --name-only "${BASE:-HEAD~1}"...HEAD 2>/dev/null || find <path> -type f)"
echo "$DIFF_FILES" | grep -q '\.go$' && HAS_GO=true
echo "$DIFF_FILES" | grep -q '\.py$' && HAS_PY=true
echo "$(date -Iseconds) preflight: HAS_GO=$HAS_GO HAS_PY=$HAS_PY" >> .agents/council/preflight.log
```
For Python (only when HAS_PY=true):
```shell
if [ "$HAS_PY" = "true" ]; then
  echo "$(date -Iseconds) preflight: checking radon" >> .agents/council/preflight.log
  if ! which radon >> .agents/council/preflight.log 2>&1; then
    echo "⚠️ COMPLEXITY SKIPPED: radon not installed (pip install radon)"
  else
    radon cc <path> -a -s 2>/dev/null | head -30
    radon mi <path> -s 2>/dev/null | head -30
  fi
else
  echo "ℹ️ COMPLEXITY SKIPPED: no .py files in diff"
fi
```
For Go (only when HAS_GO=true):
```shell
if [ "$HAS_GO" = "true" ]; then
  echo "$(date -Iseconds) preflight: checking gocyclo" >> .agents/council/preflight.log
  if ! which gocyclo >> .agents/council/preflight.log 2>&1; then
    echo "⚠️ COMPLEXITY SKIPPED: gocyclo not installed (go install github.com/fzipp/gocyclo/cmd/gocyclo@latest)"
  else
    gocyclo -over 10 <path> 2>/dev/null | head -30
  fi
else
  echo "ℹ️ COMPLEXITY SKIPPED: no .go files in diff"
fi
```
For other languages: Skip complexity with an explicit note: "⚠️ COMPLEXITY SKIPPED: No analyzer for <language>"
Interpret results:
| Score | Rating | Action |
|---|---|---|
| A (1-5) | Simple | Good |
| B (6-10) | Moderate | OK |
| C (11-20) | Complex | Flag for council |
| D (21-30) | Very complex | Recommend refactor |
| F (31+) | Untestable | Must refactor |
Include complexity findings in council context.
Detect code patterns in the target files and load matching domain-specific checklists from standards/references/:
| Trigger | Checklist | Detection |
|---|---|---|
| SQL/ORM code | sql-safety-checklist.md | Files contain SQL queries, ORM imports (database/sql, sqlalchemy, prisma, activerecord, gorm, knex), or migration files in changeset |
| LLM/AI code | llm-trust-boundary-checklist.md | Files import anthropic, openai, google.generativeai, or match *llm*, *prompt*, *completion* patterns |
| Concurrent code | race-condition-checklist.md | Files use goroutines, threading, asyncio, multiprocessing, sync.Mutex, concurrent.futures, or shared file I/O patterns |
| Codex skills | codex-skill.md | Files under skills-codex/, or files matching *codex*SKILL.md, convert.sh, skills-codex-overrides/, or converter scripts |
For each matched checklist, load it via the Read tool and include relevant items in the council packet as context.domain_checklists. Multiple checklists can be loaded simultaneously.
Skip silently if no patterns match. This step runs in both --quick and full modes (domain checklists are cheap to load and high-value).
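One trigger from the table above can be sketched as a grep over the changed files. The import patterns come from the table; the sample file and variable names are illustrative, and the real detection may differ.

```shell
# Sketch: detect SQL/ORM imports in changed files and queue the
# matching domain checklist. The sample file is illustrative.
tmp=$(mktemp -d)
printf 'import sqlalchemy\n' > "$tmp/models.py"

FILES_CHANGED="$tmp/models.py"
matched=""
for f in $FILES_CHANGED; do
  if grep -qE 'database/sql|sqlalchemy|prisma|activerecord|gorm|knex' "$f" 2>/dev/null; then
    matched="sql-safety-checklist.md"
  fi
done
echo "matched=$matched"
```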
Steps 2.4-2f, 2h, 3-3.6 (Deep Checks & Pre-Council Prep): Read references/deep-checks.md for compiled prevention, prior findings, pre-council deep analysis checks, product context, spec loading, suppressions, pre-mortem correlation, and model cost tiers. Loaded automatically unless --quick mode is set. In --quick mode, skip directly to Step 2g.
Compiled prevention inputs: Load .agents/pre-mortem-checks/ and .agents/planning-rules/ when available. These compiled artifacts contain known_risks from prior findings that inform the review — carry matched finding IDs into council context so judges can assess whether the flywheel prevented rediscovery.
Skip if --quick. Load prior findings from .agents/findings/registry.jsonl.
Skip if --quick. Run compiled constraint tests from .agents/constraints/.
Skip if --quick. Verify file metadata consistency.
Skip if --quick. Run organizational-lint checks.
Skip if --quick. Search for relevant prior learnings via ao lookup.
Skip if --quick. Run proactive bug-hunt audit on target files.
Skip if --quick. When --mixed is passed and Codex CLI is available, send the first 2000 chars of the diff to Codex for a parallel review. Cap input at 2000 chars to stay within Codex context budgets.
Skip the separate judge-fanout step if --quick. When PRODUCT.md exists and the user did not pass an explicit --preset override, quick mode still loads DX expectations inline in the single-agent review. In non-quick modes, add a DX (developer-experience) judge: 2 independent judges + 1 DX judge (3 judges total). The DX judge evaluates whether the code aligns with the product's stated personas and value propositions.
Assess test coverage against the test pyramid standard (loaded via /standards).
Read skills/vibe/references/test-pyramid-weighting.md for test pyramid weighting — L3+ tests found all production bugs, weight them 5x.
Test Pyramid Weighting: Weight test coverage by level: L0–L1 at 1x, L2 at 3x, L3+ at 5x. Unit-only coverage is a WARN signal, not a PASS. See references/test-pyramid-weighting.md.
Run even in --quick mode — this is cheap (file existence checks) and high-signal.
Identify changed modules from git diff or target scope
For each changed module, check coverage pyramid (L0–L3):
For boundary-touching code, check bug-finding pyramid (BF1–BF5):
Compute weighted pyramid score for changed code paths:
Formula:
weighted_score = (L0_count x 1 + L1_count x 1 + L2_count x 3 + L3_count x 5 + L4_count x 5) / max_possible
Where max_possible = total_test_count x 5 (the score if every test were L3+).
Count tests at each level for changed code paths:
Interpretation:
- weighted_score >= 0.6 — strong pyramid, L2+ tests present
- 0.3 <= weighted_score < 0.6 — acceptable, but recommend more integration tests
- weighted_score < 0.3 AND all tests are L0-L1 only — WARN: unit-only test coverage (feeds into the vibe verdict as a WARN signal, not a separate gate)

Satisfaction exposure: The weighted_score is also exposed as satisfaction_score (with source "test-pyramid-weighted") in the test_pyramid output block AND promoted to the top-level verdict JSON as satisfaction_score (verdict schema field, skills/council/schemas/verdict.json: number 0.0-1.0, "Probabilistic satisfaction score (0.0 = unsatisfied, 1.0 = fully satisfied). Optional — absent means not computed."). Downstream consumers (e.g., /validation STEP 1.8 holdout evaluation) can use satisfaction_score as a normalized quality signal.
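As a sketch, the formula for an example change set with 2 L0 and 8 L1 tests and no L2+ coverage works out like this (variable names are illustrative):

```shell
# Compute the weighted pyramid score:
# weights are 1x for L0-L1, 3x for L2, 5x for L3+.
L0=2; L1=8; L2=0; L3=0; L4=0
total=$((L0 + L1 + L2 + L3 + L4))
score=$(awk -v w=$((L0*1 + L1*1 + L2*3 + L3*5 + L4*5)) -v max=$((total * 5)) \
  'BEGIN { printf "%.2f", w / max }')
echo "weighted_score=$score"   # 10 / 50
```

With these counts the score is 0.20, which trips the unit-only WARN threshold described above.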
Include in council packet and vibe report output:
## Test Pyramid Score
| Level | Count | Weight | Contribution |
|-------|-------|--------|--------------|
| L0 | 2 | 1x | 2 |
| L1 | 8 | 1x | 8 |
| L2 | 0 | 3x | 0 |
| L3 | 0 | 5x | 0 |
| L4 | 0 | 5x | 0 |
| **Total** | **10** | | **10 / 50 = 0.20** |
WARN: weighted_score 0.20 < 0.3 and all tests are L0-L1 only
Build coverage table and include in council packet as context.test_pyramid:
```json
"test_pyramid": {
  "coverage": {
    "L0": {"status": "pass", "files": ["test_spec_enforcement.py"]},
    "L1": {"status": "pass", "files": ["test_module.py"]},
    "L2": {"status": "gap", "reason": "crosses subsystem boundary, no integration test"}
  },
  "bug_finding": {
    "BF4_chaos": {"status": "gap", "reason": "external API calls without failure injection"},
    "BF1_property": {"status": "na", "reason": "no data transformations in scope"}
  },
  "weighted_score": 0.20,
  "satisfaction_score": 0.20,
  "satisfaction_source": "test-pyramid-weighted",
  "score_breakdown": {"L0": 2, "L1": 8, "L2": 0, "L3": 0, "L4": 0},
  "max_possible": 50,
  "warn_unit_only": true,
  "verdict": "WARN: weighted_score 0.20 < 0.3, all tests L0-L1 only"
}
```
Verdict rules:
- weighted_score < 0.3 AND all tests L0-L1 only — WARN: unit-only coverage (include in council findings)
- weighted_score >= 0.6 — no mention needed

When coverage gaps are found, run /test <module> to generate test candidates for uncovered code.
With spec found — use code-review preset:
```shell
/council --preset=code-review validate <target>
```
- error-paths: Trace every error-handling path. What's uncaught? What fails silently?
- api-surface: Review every public interface. Is the contract clear? Breaking changes?
- spec-compliance: Compare implementation against the spec. What's missing? What diverges?

The spec content is injected into the council packet context so the spec-compliance judge can compare implementation against it.
Without spec — 2 independent judges (no perspectives):
```shell
/council validate <target>
```
2 independent judges (no perspective labels). Use --deep for 3 judges on high-stakes reviews. Override with --quick (inline single-agent check) or --mixed (cross-vendor with Codex).
Council receives:
- The spec, when one was found (in context.spec)
- The sweep manifest, when --deep or --sweep is set (in context.sweep_manifest — judges shift to adjudication mode, see references/deep-audit-protocol.md)

All council flags pass through: --quick (inline), --mixed (cross-vendor), --preset=<name> (override perspectives), --explorers=N, --debate (adversarial 2-round), --tier=<name> (model cost tier: quality/balanced/budget). See the Quick Start examples and /council docs.
Each judge reviews for:
| Aspect | What to Look For |
|---|---|
| Correctness | Does code do what it claims? |
| Security | Injection, auth issues, secrets |
| Edge Cases | Null handling, boundaries, errors |
| Quality | Dead code, duplication, clarity |
| Complexity | High cyclomatic scores, deep nesting |
| Architecture | Coupling, abstractions, patterns |
| Council Verdict | Vibe Result | Action |
|---|---|---|
| PASS | Ready to ship | Merge/deploy |
| WARN | Review concerns | Address or accept risk |
| FAIL | Not ready | Fix issues |
Write to: .agents/council/YYYY-MM-DD-vibe-<target>.md (use date +%Y-%m-%d)
Read references/report-format.md for the full vibe report markdown template. The report includes: complexity analysis, council verdict table, shared/critical/informational findings, all findings (when --deep/--sweep), recommendation, and decision checkboxes.
Tell the user:
After council verdict:
```shell
ao ratchet record vibe --output "<report-path>" 2>/dev/null || true
```

Read the council report. For each finding, format as:
```
FINDING: <description> | FIX: <fix or recommendation> | REF: <ref or location>
```
Fallback for v1 findings (no fix/why/ref fields):
```
fix = finding.fix || finding.recommendation || "No fix specified"
ref = finding.ref || finding.location || "No reference"
```
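The fallback chain maps naturally onto shell default expansion. This is a minimal sketch; the finding field values are illustrative, not from a real report.

```shell
# v1 fallback chain via ${var:-default} expansion.
finding_fix=""                              # v1 finding: no fix field
finding_recommendation="Quote the variable"
finding_ref=""                              # no ref field
finding_location="src/run.sh:42"

fix="${finding_fix:-${finding_recommendation:-No fix specified}}"
ref="${finding_ref:-${finding_location:-No reference}}"
echo "FIX: $fix | REF: $ref"
```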
If verdict is WARN or FAIL, persist reusable findings to .agents/findings/registry.jsonl and optionally mirror the broader narrative to a learning file.
Registry write rules:
- Each entry carries dedup_key, provenance, pattern, detection_question, checklist_item, applicable_when, and confidence
- applicable_when must use the controlled vocabulary from the finding-registry contract
- Deduplicate entries by dedup_key

If a broader prose summary still helps, also write the existing anti-pattern learning file to .agents/learnings/YYYY-MM-DD-vibe-<target>.md. Skip both if verdict is PASS.
After the registry update, if hooks/finding-compiler.sh exists, run:
```shell
bash hooks/finding-compiler.sh --quiet 2>/dev/null || true
```
This keeps the same-session post-mortem path synchronized with the latest reusable findings. session-end-maintenance.sh remains the idempotent backstop.
After validation completes, clean up stale test beads (bd list --status=open | grep -iE "test bead|test quest") via bd close to prevent bead pollution. Skip if bd unavailable.
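The stale-bead filter can be sketched as below. The `bd list` output format is an assumption, so a canned sample stands in; each matching bead would then be passed to `bd close`.

```shell
# Filter open beads down to stale test beads (sample output stands in
# for `bd list --status=open`, whose real format may differ).
OPEN_BEADS='bd-101 open Fix auth regression
bd-102 open test bead: scratch run
bd-103 open Test Quest placeholder'

STALE=$(echo "$OPEN_BEADS" | grep -iE "test bead|test quest")
echo "$STALE"
# each matching bead id would then go to `bd close`
```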
```
/implement issue-123
     │
     ▼
(coding, quick lint/test as you go)
     │
     ▼
/vibe  ← You are here
     │
     ├── Complexity analysis (find hotspots)
     ├── Bug hunt audit (find concrete bugs)
     └── Council validation (multi-model judgment)
           │
           ├── PASS → ship it
           ├── WARN → review, then ship or fix
           └── FAIL → fix, re-run /vibe
```
User says: "Run a quick validation on the latest changes."
Do:

```shell
/vibe recent
```

Examples:

```shell
/vibe recent
```
Runs complexity on recent changes, then council reviews.

```shell
/vibe src/auth/
```
Complexity + council on the auth directory.

```shell
/vibe --deep recent
```
Complexity + 3 judges for a thorough review.

```shell
/vibe --mixed recent
```
Complexity + Claude + Codex judges.
See references/examples.md for additional examples: security audit with spec compliance, developer-experience code review with PRODUCT.md, and fast inline checks.
| Problem | Cause | Solution |
|---|---|---|
| "COMPLEXITY SKIPPED: radon not installed" | Python complexity analyzer missing | Install with pip install radon or skip complexity (council still runs). |
| "COMPLEXITY SKIPPED: gocyclo not installed" | Go complexity analyzer missing | Install with go install github.com/fzipp/gocyclo/cmd/gocyclo@latest or skip. |
| Vibe returns PASS but constraint tests fail | Council LLMs miss mechanical violations | Check .agents/council/<timestamp>-vibe-*.md for constraint test results. Failed constraints override council PASS. Fix violations and re-run. |
| Codex review skipped | --mixed not passed, Codex CLI not on PATH, or no uncommitted changes | Codex review is opt-in — pass --mixed to enable. Also requires Codex CLI on PATH and uncommitted changes. |
| "No modified files detected" | Clean working tree, no recent commits | Make changes or specify target path explicitly: /vibe src/auth/. |
| Spec-compliance judge not spawned | No spec found in beads/plans | Reference bead ID in commit message or create plan doc in .agents/plans/. Without spec, vibe uses 2 independent judges (3 with --deep). |
The hooks/write-time-quality.sh PostToolUse hook runs automatically after every Write/Edit tool call, catching common anti-patterns at edit time rather than review time. It checks:
- Go: fmt.Print in library code
- Python: bare except:, eval/exec, missing type hints on public functions
- Shell: missing set -euo pipefail, unquoted variables

The hook is non-blocking (always exits 0) and outputs warnings via JSON. See references/write-time-quality.md for the full design.
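In the spirit of the hook, one shell check can be sketched as follows. The function name and JSON shape are illustrative, not the hook's actual output contract.

```shell
# Sketch of one write-time check: warn when a shell file lacks
# `set -euo pipefail`, but never block the edit (always return 0).
check_shell_strict_mode() {
  local file="$1"
  if ! grep -q 'set -euo pipefail' "$file"; then
    printf '{"level":"warn","file":"%s","msg":"missing set -euo pipefail"}\n' "$file"
  fi
  return 0   # non-blocking: the hook always exits 0
}

f=$(mktemp)
printf '#!/usr/bin/env bash\necho hi\n' > "$f"
check_shell_strict_mode "$f"
```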
- skills/council/SKILL.md — Multi-model validation council
- skills/complexity/SKILL.md — Standalone complexity analysis
- skills/bug-hunt/SKILL.md — Proactive code audit and bug investigation
- .agents/specs/conflict-resolution-algorithm.md — Conflict resolution between agent findings