This skill should be used when auditing a codebase for AI agent readiness, or when guiding improvements to make a codebase work well with agentic coding tools. It applies when users ask to evaluate test coverage, file structure, type system usage, dev environment speed, or automated enforcement -- the five pillars that determine how effectively coding agents can operate in a project. Triggers on "audit my codebase", "make this agent-ready", "improve for AI agents", "agent-friendly", or questions about why agents struggle with a codebase.
From casper

npx claudepluginhub casper-studios/casper-marketplace --plugin casper

This skill uses the workspace's default tool permissions.
references/checklist.md
When agents struggle with a codebase, they are reflecting and amplifying the codebase's existing weaknesses. This skill evaluates codebases against five principles that determine agent effectiveness, and provides concrete guidance to improve each one. It adapts to the project's language and stack.
Based on "AI Is Forcing Us To Write Good Code".
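Determine which mode to operate in based on context: audit mode, which evaluates the codebase against the five principles, or guide mode, which helps improve a specific principle.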
If the mode is unclear, ask.
To audit a codebase, work through these steps:
Identify the primary language, test framework, build system, and database by examining project files (e.g. package.json, go.mod, Gemfile, pyproject.toml, Cargo.toml). This determines which tooling recommendations apply.
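As a rough illustration of this detection step, the sketch below checks for the marker files named above at the top of a repository. The `MARKERS` mapping and the `detect_stacks` name are hypothetical, chosen for this example; a real audit would also look at the test framework, build system, and database, which this sketch does not attempt.

```python
from pathlib import Path

# Marker file -> ecosystem it usually indicates. The file names come from the
# step above; the mapping and function name are illustrative assumptions.
MARKERS = {
    "package.json": "JavaScript/TypeScript",
    "go.mod": "Go",
    "Gemfile": "Ruby",
    "pyproject.toml": "Python",
    "Cargo.toml": "Rust",
}


def detect_stacks(root: str) -> list[str]:
    """Return the ecosystems whose marker file exists at the top of `root`."""
    base = Path(root)
    found = {stack for name, stack in MARKERS.items() if (base / name).is_file()}
    return sorted(found)
```

A repository can match more than one marker (for example a Go service with a JavaScript frontend), which is why the function returns a list rather than a single value.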
Read references/checklist.md for detailed criteria per principle. For each principle, determine the current state:
- File Structure: check for catch-all names (utils, helpers, common). Assess whether filenames communicate domain purpose.
- Types: identify any/untyped gaps.

Present findings as a table with one row per principle:
| Principle | Rating | Key Finding |
|---|---|---|
| Test Coverage | Strong / Adequate / Weak | e.g. "87% coverage, no CI enforcement" |
| File Structure | Strong / Adequate / Weak | e.g. "3 files over 500 lines, 2 catch-all utils files" |
| Types | Strong / Adequate / Weak | e.g. "Strict TS, but no API schema generation" |
| Dev Environments | Strong / Adequate / Weak | e.g. "Manual 8-step setup, no concurrent support" |
| Enforcement | Strong / Adequate / Weak | e.g. "ESLint configured but not in CI" |
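
A File Structure finding like the example in the table could be gathered with a quick scan. The following is a minimal sketch, not part of the skill itself: the 500-line threshold simply echoes the example finding, the catch-all stems mirror the names mentioned earlier, and `scan_file_structure` and its default suffixes are hypothetical.

```python
from pathlib import Path

# Both values are assumptions for illustration: 500 lines echoes the example
# finding in the table, and the stems mirror the catch-all names above.
MAX_LINES = 500
CATCH_ALL_STEMS = {"utils", "helpers", "common"}


def scan_file_structure(
    root: str, suffixes: tuple[str, ...] = (".py", ".ts", ".go")
) -> dict[str, list[str]]:
    """Collect oversized files and catch-all filenames under `root`."""
    oversized: list[str] = []
    catch_all: list[str] = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root).as_posix()
        # Count lines lazily so large files are not loaded into memory at once.
        with path.open(encoding="utf-8", errors="replace") as handle:
            line_count = sum(1 for _ in handle)
        if line_count > MAX_LINES:
            oversized.append(rel)
        if path.stem.lower() in CATCH_ALL_STEMS:
            catch_all.append(rel)
    return {"oversized": oversized, "catch_all": catch_all}
```

This sketch walks every directory, so in practice it would need to skip vendored or generated folders such as dependency caches before the counts mean much.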
Rank the weakest principles and suggest concrete next steps for the top 2-3. Each recommendation should reference the project's actual stack and tooling.
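The ranking step can be sketched as a sort over the table rows. This is only one possible approach: the numeric scores in `RATING_SCORE` are an assumption used for ordering (the skill defines just the three labels), and `Finding` and `weakest` are hypothetical names.

```python
from dataclasses import dataclass

# Lower score means weaker. The numbers are an assumption used only for
# sorting; the audit itself reports the labels, not the scores.
RATING_SCORE = {"Weak": 0, "Adequate": 1, "Strong": 2}


@dataclass(frozen=True)
class Finding:
    principle: str
    rating: str  # one of "Strong", "Adequate", "Weak"
    key_finding: str


def weakest(findings: list[Finding], limit: int = 3) -> list[Finding]:
    """Return up to `limit` findings rated below Strong, weakest first."""
    needs_work = [
        f for f in findings if RATING_SCORE[f.rating] < RATING_SCORE["Strong"]
    ]
    # sorted() is stable, so ties keep the order the findings were given in.
    return sorted(needs_work, key=lambda f: RATING_SCORE[f.rating])[:limit]
```

Principles rated Strong are filtered out, so a healthy codebase yields fewer than three recommendations instead of padding the list.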
When guiding improvements to a specific principle:
Read references/checklist.md for the relevant section.

The most counterintuitive principle deserves emphasis. At 100% line coverage:
checklist.md -- Detailed evaluation criteria for each of the five principles, including stack-specific tooling, key indicators (Strong/Adequate/Weak), and guidance. Load this file when performing an audit or providing detailed guidance on any principle.