From evanflow
Runs an iterative self-review loop after code implementation: executes the project's quality checks (lint, test, typecheck), reviews diffs for dead code, naming, weak tests, and failure modes, and fixes until clean. Includes UI visual verification.
npx claudepluginhub evanklem/evanflow --plugin evanflow
This skill uses the workspace's default tool permissions.
See `evanflow` meta-skill. Key terms: **deep modules**, **deletion test**, **vertical slice**.
Use after evanflow-executing-plans finishes all tasks.
SKIP when: the change is one line or trivially correct.
Repeat until stopping condition met:
Step 1. Run the project's quality checks. Exact commands are project-specific (see CLAUDE.md or the project's README). Typical examples across stacks:
# typecheck — one of:
tsc --noEmit # TypeScript
pnpm typecheck # if scripted
cargo check # Rust
go vet ./... # Go
# lint — one of:
pnpm lint
eslint .
cargo clippy
ruff check .
# test — one of:
pnpm test
pytest
cargo test
go test ./...
If any check fails: fix and restart the loop. Don't proceed to step 2 with broken checks.
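The "stop at the first failing check" rule can be sketched as a small shell function. This is a minimal sketch, not part of evanflow: `run_checks` is a hypothetical helper, and the commands passed to it are whatever your project actually uses.

```shell
# Hypothetical helper, not part of evanflow: run each check command in order
# and stop at the first failure instead of moving on with broken checks.
run_checks() {
  for cmd in "$@"; do
    echo "running: $cmd"
    if ! sh -c "$cmd"; then
      echo "check failed: $cmd (fix it, then restart the loop)" >&2
      return 1
    fi
  done
  echo "all checks passed"
}

# Example call, assuming these scripts exist in package.json:
#   run_checks "pnpm typecheck" "pnpm lint" "pnpm test"
```

Passing each check as a single quoted string keeps the function independent of any one stack; the same call shape works for `cargo check`, `ruff check .`, or `go test ./...`.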
Step 2. Review the diff:
git diff # working-tree changes
git diff HEAD~N..HEAD # if reviewing a series of past commits
For each changed file, look critically for:
(... evanflow-glossary.)
Type escapes: any, as, @ts-ignore? Justified?
Auth: authenticatedProcedure where needed? Resource ownership re-derived from ctx.user? Per CLAUDE.md.
Fix what you find. Then restart from step 1.
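The check for `any`, `as`, and `@ts-ignore` in the list above can be partly mechanized for TypeScript projects. The sketch below is an assumption-laden aid, not part of evanflow: `flag_type_escapes` is a hypothetical name, and it only matches `@ts-ignore`, `as any`, and `: any` on added lines, so other `as` casts still need a manual read.

```shell
# Hypothetical helper: read a unified diff on stdin and print added lines
# that contain common TypeScript type escapes.
flag_type_escapes() {
  grep -E '^\+' |               # keep added lines only
    grep -vE '^\+\+\+ ' |       # drop the "+++ b/file" header lines
    grep -E '@ts-ignore|as any|: any'
}

# Example call (exits non-zero when nothing is flagged):
#   git diff | flag_type_escapes
```

Each flagged line still needs the judgment the list asks for: is the escape justified?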
Step 3. Industry research identifies five predictable failure modes in agentic coding. After step 2's diff review, do an explicit pass against each:
(... process.env.STRIPE_SECRET_KEY reference when the actual var name is STRIPE_SK.)
... CONTEXT.md? Names, conventions, invariants?
For each mode flagged, fix and restart from step 1.
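The environment-variable example above (a `process.env.STRIPE_SECRET_KEY` reference when the real name is `STRIPE_SK`) is the kind of mismatch a quick script can surface. The following is a minimal sketch under stated assumptions: `undefined_env_refs` is a hypothetical helper, variable names are assumed to be upper-case, and the env file is assumed to use plain `NAME=value` lines.

```shell
# Hypothetical helper: print every process.env.NAME referenced in a source
# file ($1) that has no NAME= line in an env file ($2, e.g. .env.example).
undefined_env_refs() {
  src_file=$1
  env_file=$2
  grep -oE 'process\.env\.[A-Z_][A-Z0-9_]*' "$src_file" |
    sed 's/^process\.env\.//' |
    sort -u |
    while read -r name; do
      grep -q "^${name}=" "$env_file" || echo "$name"
    done
}

# Example call, assuming these paths exist in your project:
#   undefined_env_refs src/billing.ts .env.example
```

An empty result does not prove the references are correct; it only means every referenced name appears in the env file you pointed it at.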
Step 4. If the diff touches frontend page or component files and the change has visible output:
Default approach (no Playwright needed):
# Make sure your dev server is running first (e.g., pnpm dev, npm run dev, etc.)
chromium --headless --no-sandbox \
--screenshot=/tmp/iter-$(date +%s).png \
--window-size=1440,900 \
http://localhost:<port>/<route>
(If your project doesn't have chromium, substitute google-chrome --headless or chrome --headless with the same flags.)
Then read the screenshot:
Read /tmp/iter-*.png
Check against:
... --window-size=390,844 (mobile)
If you need interaction (click, fill, observe modal): use Playwright MCP. If MCP fails with "chrome not found", configure it to use your installed Chromium binary by adding "--executable-path", "/path/to/chromium" to args in the Playwright .mcp.json. Don't fight the MCP: fix it once, then use it.
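For the non-interactive case, the desktop (1440,900) and mobile (390,844) captures mentioned above can be generated from one helper. This is a sketch only: `screenshot_cmds` is a hypothetical function, the output file naming is an assumption (the command shown earlier uses a timestamp instead), and it prints the commands rather than running them so you can review them first.

```shell
# Hypothetical helper: print one headless-Chromium screenshot command per
# viewport, using the same flags as the single-viewport command shown earlier.
screenshot_cmds() {
  url=$1
  out_dir=${2:-/tmp}
  for size in 1440,900 390,844; do
    name=$(echo "$size" | tr ',' 'x')   # 1440,900 -> 1440x900
    echo "chromium --headless --no-sandbox --screenshot=$out_dir/iter-$name.png --window-size=$size $url"
  done
}

# Example call; the port and route are placeholders for your dev server:
#   screenshot_cmds "http://localhost:3000/dashboard"
```

Run the printed commands once the dev server is up, then read both screenshots as described above.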
Stop the loop when all are true:
Hard cap: 5 iterations. If you're still finding issues at iteration 5, the original plan was wrong — stop and ask the user. Don't iterate forever.
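The hard cap can be pictured as a bounded loop. In the sketch below, `iterate` and `MAX_ITERATIONS` are hypothetical names, and the single argument stands in for "one full pass of the steps above that exits 0 only when nothing was found"; how you implement that pass is up to your project.

```shell
MAX_ITERATIONS=5

# Hypothetical helper: repeat a pass command until it succeeds or the cap
# is reached. $1 is a command that exits 0 when a full pass is clean.
iterate() {
  pass_cmd=$1
  i=1
  while [ "$i" -le "$MAX_ITERATIONS" ]; do
    if sh -c "$pass_cmd"; then
      echo "clean after $i iteration(s)"
      return 0
    fi
    i=$((i + 1))
  done
  echo "still finding issues after $MAX_ITERATIONS iterations: stop and ask the user" >&2
  return 1
}
```

The non-zero return on hitting the cap is the point of the design: it forces a hand-off instead of a sixth attempt.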
evanflow-writing-plans (plan was wrong)
evanflow-improve-architecture
evanflow-debug