Audit a service or module for agent-friendliness — identify confusing boundaries, tightly coupled code, and shallow modules that make AI implementation harder and less reliable
From claude-toolkit: `npx claudepluginhub johwer/marketplace --plugin claude-toolkit`. This skill uses the workspace's default tool permissions.
Refactor codebases for clarity and testability — especially to make them more reliable for AI-assisted implementation. Tightly coupled, shallow, or confusingly bounded modules produce inconsistent agent outputs. This audit identifies and fixes those.
$ARGUMENTS
If a specific service or directory is provided, audit that. Otherwise ask: "Which service or area do you want to audit?"
Works on both:

- `apps/web/src/` (frontend)
- `services/<Service>/` (backend)

Read the directory structure and key files to understand what exists:

```bash
# Frontend
find apps/web/src/<area> -type f | head -60
# Backend
find services/<Service>/<Service>/ -type f -name "*.cs" | head -60
```
For each module/file, note:
Symptoms:
Fixes:
Symptoms (from John Ousterhout's A Philosophy of Software Design):
Fixes:
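To make the shallow-module symptom concrete, here is a minimal TypeScript sketch. The `Discounts` classes and rate codes are hypothetical examples, not taken from any audited codebase:

```typescript
// Shallow module (smell): the interface exposes roughly as much as it
// hides, so every caller still has to know the rate-table semantics.
class ShallowDiscounts {
  private rates: Record<string, number> = { SAVE10: 0.1, SAVE20: 0.2 };
  getRate(code: string): number | undefined {
    return this.rates[code];
  }
}

// Deep module (fix): one small call hides the real decisions:
// unknown codes, clamping at zero, and rounding to cents.
class DeepDiscounts {
  private rates: Record<string, number> = { SAVE10: 0.1, SAVE20: 0.2 };
  apply(price: number, code?: string): number {
    const rate = code ? this.rates[code] ?? 0 : 0;
    const discounted = price * (1 - rate);
    return Math.max(0, Math.round(discounted * 100) / 100);
  }
}
```

The deep version is what makes agent output reliable: an agent calling `apply` cannot mishandle unknown codes, because that decision lives in one place.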
Symptoms:
Frontend fixes:
Backend fixes:
Symptoms:
Fixes:
Score each finding:
| Priority | Criteria |
|---|---|
| P1 | Blocks testability — can't write a unit test without this fix |
| P2 | Makes agent implementation unreliable — AI makes wrong assumptions from confusing names/boundaries |
| P3 | Reduces readability — humans struggle, but not a blocker |
| P4 | Style/preference — worth fixing during normal feature work |
Only recommend implementing P1 and P2 findings immediately. P3/P4 go into tech debt backlog.
For each P1/P2 finding, produce a concrete refactor:

```
FINDING: [name]
Location: [file:line]
Problem: [one sentence]
Fix: [what to do]
Impact: [what becomes easier after this fix]
Risk: [what could break — migration path if needed]
Test: [how to verify the fix didn't change behavior]
```
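For reference, a filled-in finding might look like the following. Every name, path, and line number here is invented for illustration, not from a real audit:

```
FINDING: Inline fetch in OrderList component
Location: apps/web/src/orders/OrderList.tsx:48
Problem: The component calls fetch() directly, so rendering and data access cannot be tested separately.
Fix: Move the request into a useOrders() hook and inject the fetcher.
Impact: OrderList becomes a pure render target; the hook is unit-testable with a stubbed fetcher.
Risk: Low; behavior unchanged if the hook's return shape matches the old local state.
Test: Existing OrderList snapshot tests, plus a new useOrders test with a fake fetcher.
```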
If there are 3 or more P1/P2 findings, suggest running /request-refactor-plan before touching any code:
> "There are [N] high-priority findings. Want me to run /request-refactor-plan first? It turns this into a structured tiny-commit plan filed as a Jira ticket — safer to implement incrementally than all at once. Or say 'just do it' to start implementing now."
If the user confirms, hand off to /request-refactor-plan with the audit findings as context.
For approved refactors:
```bash
# Frontend
cd apps/web && npm run type-check && npm run test
# Backend
cd services/<Service>/<Service>.Test && dotnet test
```
No new test failures. No new type errors.
After the audit, score the module against this checklist. Low scores = high refactor priority.
Frontend smells:

- Logic in `*.tsx` files (should be in `hooks/`)
- Repeated `useSelector` patterns that could be a custom hook

Backend smells:

- Controller calling Repository directly (skipping the service layer)
- Service importing a concrete Repository instead of `IRepository`
- `DbContext` `OnModelCreating` logic that belongs in a service
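The repeated-`useSelector` smell in the frontend checklist can be sketched as follows. The store shape and names are hypothetical, and the hook wrapper is shown as a comment because it depends on react-redux:

```typescript
// Assumed store shape, for illustration only.
interface RootState {
  cart: { items: { price: number; qty: number }[] };
}

// The smell: this reduce gets copy-pasted inline into several components:
//   const total = useSelector(s => s.cart.items.reduce(...));
// Fix step 1: name the logic once as a plain, unit-testable selector.
const selectCartTotal = (s: RootState): number =>
  s.cart.items.reduce((sum, i) => sum + i.price * i.qty, 0);

// Fix step 2 (sketch; requires react-redux): wrap it as a custom hook so
// components stop repeating the selector body:
//   export const useCartTotal = () => useSelector(selectCartTotal);
```

Because the selector is a pure function, it can be tested without rendering a component, which is exactly what makes the module agent-friendly.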
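The concrete-Repository smell from the backend checklist is language-agnostic; the shape is the same in C# with `IRepository`. A hedged sketch in TypeScript, with hypothetical names:

```typescript
interface User { id: string; name: string }

// The abstraction the service should depend on (IRepository in C#).
interface UserRepository {
  findById(id: string): User | undefined;
}

// Smell would be `new SqlUserRepository()` inside the service, pinning it
// to one implementation. Fix: accept the interface via the constructor.
class UserService {
  constructor(private repo: UserRepository) {}

  displayName(id: string): string {
    return this.repo.findById(id)?.name ?? "unknown";
  }
}

// Tests can now pass an in-memory fake instead of a real database.
class FakeUserRepository implements UserRepository {
  constructor(private users: User[]) {}
  findById(id: string): User | undefined {
    return this.users.find(u => u.id === id);
  }
}
```

With the interface in place, an agent implementing a new service method never needs to reason about the database, only about the repository contract.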