npx claudepluginhub tody-agent/codymaster --plugin cmThis skill uses the workspace's default tool permissions.
Enables AI agents to execute x402 payments with per-task budgets, spending controls, and non-custodial wallets via MCP tools. Use when agents pay for APIs, services, or other agents.
Compares coding agents like Claude Code and Aider on custom YAML-defined codebase tasks using git worktrees, measuring pass rate, cost, time, and consistency.
Designs and optimizes AI agent action spaces, tool definitions, observation formats, error recovery, and context for higher task completion rates.
Running commands without checking output is flying blind. Users MUST see what's happening at every step.
Core principle: Every command gets announced, monitored, and verified. No exceptions.
Violating the letter of this rule is violating the spirit of this rule.
NO COMMAND RUNS WITHOUT READING ITS OUTPUT.
NO ERROR GOES UNREPORTED.
NO NEXT STEP WITHOUT PREVIOUS STEP CONFIRMED.
ALWAYS when running terminal commands via run_command. This includes:
npm run build, npm run dev)npx vitest run)npm install)npx wrangler pages deploy)git push, git commit)BEFORE calling run_command:
Update task_boundary TaskStatus to describe what you're about to run and why.
TaskStatus: "Running npm build to compile production bundle"
TaskStatus: "Installing dependencies with npm install"
TaskStatus: "Deploying to Cloudflare Pages"
Choose WaitMsBeforeAsync based on expected command duration:
| Command Type | WaitMsBeforeAsync | Strategy |
|---|---|---|
Quick (< 3s) — git status, ls, cat | 3000-5000 | Synchronous, read output directly |
Medium (3-30s) — npm install, build | 5000-10000 | Wait for initial output, then poll |
Long (> 30s) — deploy, test suites | 2000-5000 | Send to background, poll actively |
After run_command returns:
command_status immediately with WaitDurationSeconds: 10Scan output for error indicators:
ERROR PATTERNS TO DETECT:
- Exit code ≠ 0
- "error", "Error", "ERROR"
- "fail", "FAIL", "failed", "FAILED"
- "ENOENT", "EACCES", "EPERM"
- "not found", "No such file"
- "Cannot find module"
- "SyntaxError", "TypeError", "ReferenceError"
- "Build failed"
- "Command failed"
- "Permission denied"
- "FATAL"
- Stack traces (lines with "at " prefix)
- npm ERR!
- Warning patterns that indicate real problems
If ANY error is detected:
1. STOP — Do not run any more commands
2. IDENTIFY — Extract the exact error message and context
3. REPORT — Call notify_user with error if critical
4. FIX — Use cm-debugging if proceeding to fix
For background commands (returned a CommandId):
command_status every 10-15 secondstask_boundary TaskStatus with latest output summaryOnly after reading output AND confirming no errors:
Update task_boundary TaskSummary with the result:
TaskSummary: "Build completed successfully (0 errors, 0 warnings)"
TaskSummary: "All 519 tests passed"
TaskSummary: "Deployed to https://prms-4pv.pages.dev successfully"
If you catch yourself doing ANY of these:
run_command while previous command is still runningcommand_status for a background commandALL of these mean: STOP. Follow the protocol.
| DON'T | DO |
|---|---|
| Run and forget | Run and read output |
| Assume success | Verify success from output |
| Chain commands blindly | Verify each before next |
| Ignore warnings | Report warnings to user |
| Multiple commands without checking | Sequential with verification |
WaitMsBeforeAsync: 3000-5000If running multiple commands at once:
&&, |)cd dir && npm run build — if cd fails, build won't run| Level | Action | Example |
|---|---|---|
| 🟢 Success | Update TaskSummary, proceed | "Build succeeded" |
| 🟡 Warning | Report to user, ask if proceed | "Deprecated dependency" |
| 🔴 Error | STOP immediately, notify_user or fix | "Build failed", "Test failed" |
| ⚫ Fatal | STOP immediately, notify_user | "ENOENT", "Permission denied" |
Every command tells a story through its output. READ the story. SHARE it with the user. STOP if the story is bad.