Audit codebases for over-engineering, enforce YAGNI principles, and generate the simplest solutions using stdlib-first approaches. Includes automated reviews of diffs and whole-repo scans for dead code and unnecessary abstractions.
Whole-repo audit for over-engineering. Like ponytail-review, but scans the entire codebase instead of a diff: a ranked list of what to delete, simplify, or replace with stdlib/native equivalents. Use when the user says "audit this codebase", "audit for over-engineering", "what can I delete from this repo", "find bloat", "ponytail-audit", or "/ponytail-audit". One-shot report, does not apply fixes.
Harvest every `ponytail:` comment in the codebase into a debt ledger, so the deliberate shortcuts and deferrals ponytail leaves behind get tracked instead of rotting into "later means never". Use when the user says "ponytail debt", "/ponytail-debt", "what did ponytail defer", "list the shortcuts", "ponytail ledger", or "what did we mark to do later". One-shot report, changes nothing.
Show ponytail's measured impact as a compact scoreboard: less code, less cost, more speed, from the benchmark medians. One-shot display, not a persistent mode, and not a per-repo number. Trigger: /ponytail-gain, "ponytail gain", "what does ponytail save", "show ponytail impact", "ponytail scoreboard".
Quick-reference card for all ponytail modes, skills, and commands. One-shot display, not a persistent mode. Trigger: /ponytail-help, "ponytail help", "what ponytail commands", "how do I use ponytail".
Code review focused exclusively on over-engineering. Finds what to delete: reinvented standard library, unneeded dependencies, speculative abstractions, dead flexibility. One line per finding: location, what to cut, what replaces it. Use when the user says "review for over-engineering", "what can we delete", "is this over-engineered", "simplify review", or invokes /ponytail-review. Complements correctness-focused review; this one only hunts complexity.
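The debt ledger described above amounts to a comment scan. As a rough sketch only (not the plugin's actual implementation), the following assumes a marker is the literal text `ponytail:` anywhere on a line and that the rest of the line is the note. The function name `harvest` and the list of file suffixes are hypothetical.

```python
import re
from pathlib import Path

# Assumption: a marker is the literal text "ponytail:" on a line;
# everything after it is treated as the note.
MARKER = re.compile(r"ponytail:\s*(.*)")


def harvest(root: str, suffixes=(".py", ".ts", ".tsx", ".html")) -> list[tuple[str, int, str]]:
    """Return (path, line number, note) for every marker found under root."""
    ledger = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = MARKER.search(line)
            if match:
                # Drop a trailing HTML comment terminator, if present.
                note = match.group(1).strip().removesuffix("-->").strip()
                ledger.append((str(path), lineno, note))
    return ledger
```

Calling `harvest(".")` from a repo root and printing one row per line gives a ledger in the same one-line-per-finding spirit as the review output.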
He says nothing. He writes one line. It works.
~54% less code (up to 94%) · ~20% cheaper · ~27% faster · 100% safe
Measured on real Claude Code sessions editing a real open-source repo (FastAPI + React), against the same agent with no skill. ~54% is the mean across 12 feature tasks (Haiku 4.5, n=4); it reaches 94% where an agent over-builds (a date picker) and is near zero where the code is already minimal. ponytail keeps every safety guard while a bare "write one-liners" prompt drops one. (The earlier single-shot benchmark reported 80-94% as a flat figure; against a fair agentic baseline that is the per-task ceiling, not the average.) Full writeup · reproduce it.
You know him. Long ponytail. Oval glasses. Has been at the company longer than the version control. You show him fifty lines; he looks at them, says nothing, and replaces them with one.
Ponytail puts him inside your AI agent.
You ask for a date picker. Your agent installs flatpickr, writes a wrapper component, adds a stylesheet, and starts a discussion about timezones.
With ponytail:
<!-- ponytail: browser has one -->
<input type="date">
More survivors in examples/.
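The same move applies outside the browser. The sketch below is a hypothetical illustration, not taken from the repo's examples/ directory: a hand-rolled word tally next to the standard-library call that replaces it. Both function names are invented for the example.

```python
from collections import Counter


def top_words_handrolled(text: str, n: int) -> list[tuple[str, int]]:
    """The over-built version: manual tally plus a manual sort."""
    counts: dict[str, int] = {}
    for word in text.lower().split():
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def top_words(text: str, n: int) -> list[tuple[str, int]]:
    # ponytail: stdlib has one
    return Counter(text.lower().split()).most_common(n)
```

Both functions return the same ranking for the same input; the second simply has less to maintain.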
The honest measurement is a real agent doing real work: a headless Claude Code session editing tiangolo's full-stack-fastapi-template (a real FastAPI + React repo), scored on the git diff it leaves behind. Twelve feature tickets, the same agent with and without the skill, n=4, Haiku 4.5.
| vs no-skill baseline | LOC | tokens | cost | time | safe |
|---|---|---|---|---|---|
| ponytail | -54% | -22% | -20% | -27% | 100% |
| caveman (terse-prose control) | -20% | +7% | +3% | +2% | 100% |
| "YAGNI + one-liners" prompt | -33% | -14% | -21% | -30% | 95% |
ponytail is the only arm that cuts every metric, and the only one that stays fully safe while doing it. The cut is biggest where there is a real over-build trap (date picker 404 to 23 lines, color picker 287 to 23, because it reaches for a native `<input>` instead of a component) and near zero on code that is already minimal. Full method, per-task tables, and limitations: `benchmarks/results/2026-06-18-agentic.md`.
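To make the arithmetic behind those per-task figures concrete, here is a minimal sketch. It assumes LOC means lines added in the diff, read from `git diff --numstat` output; the benchmark's actual scoring script may count differently. `added_lines` and `reduction` are hypothetical helper names.

```python
def added_lines(numstat: str) -> int:
    """Sum the 'added' column of `git diff --numstat` output."""
    total = 0
    for row in numstat.splitlines():
        if not row.strip():
            continue
        added, _deleted, _path = row.split("\t", 2)
        if added != "-":  # binary files are reported as "-"
            total += int(added)
    return total


def reduction(baseline_loc: int, skill_loc: int) -> float:
    """Percent fewer lines than the baseline (positive means less code)."""
    return round(100 * (1 - skill_loc / baseline_loc), 1)


print(reduction(404, 23))  # date picker  -> 94.3
print(reduction(287, 23))  # color picker -> 92.0
```

The date picker lands at the 94% ceiling quoted above, while the table's -54% is the mean over all twelve tasks, including those where the baseline was already minimal.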
Five everyday tasks, three models, three arms (no skill, caveman, ponytail), ten runs, median reported. One prompt, one completion, counting lines of the answer:
This showed 80-94% less code. #126 fairly pointed out that the bare-model baseline pads its answer with prose and options, so that gap is partly a conversational-baseline artifact. The agentic numbers above are the corrected, defensible version. Reproduce the single-shot run with `npx promptfoo eval -c benchmarks/promptfooconfig.yaml`.
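A rough sketch of the single-shot scoring shows where the artifact comes from. It assumes that "lines of the answer" means every non-blank line of the completion, prose included; the real promptfoo configuration may count differently. `median_lines` is a hypothetical helper name.

```python
from statistics import median


def median_lines(runs: list[str]) -> float:
    """Median non-blank line count across completions from repeated runs."""
    counts = [
        sum(1 for line in text.splitlines() if line.strip())
        for text in runs
    ]
    return median(counts)
```

Because every non-blank line counts under this assumption, a baseline that wraps its code in explanation and alternatives scores as "more code" even when the code itself is the same size, which is the padding effect #126 describes.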
`npx claudepluginhub dietrichgebert/ponytail --plugin ponytail`
Anti-over-engineering skill for AI coding agents. Teaches your AI when to stop.
Mindful AI coding framework — discipline over cleverness. Skill + 21 slash commands + 8 specialist agents + 5 runtime hooks + 15 default checklists + Master Orchestrator + Gravity hub. Works on any model tier (Opus/Sonnet/Haiku). Integrates Claude Design for visual work.
Claude Code plugin that uses skill architecture to intercept vague prompts, ask clarifying questions, and return structured, framework-aware prompts that have credit-saving patterns built in.
Simplifies and optimizes code to improve clarity, consistency, and maintainability while preserving the original functionality.
Code transformation: Dev SDLC orchestrator (code-shipping pipeline), plan, assert, audit, review, test, refactor, debug, for-sure. Hosts engineering agents.
Claude Code plugin channeling Taylor Otwell's Laravel philosophy