CLI-based implementation lifecycle using the Python Agent SDK. Orchestrates plan, implement, review/address loop, docs gate, verify, and merge via claude-agent-sdk subprocesses with native async parallelism. Triggers: /implement-cli
npx claudepluginhub benjamcalvin/bootstraps --plugin implement-cli
This skill uses the workspace's default tool permissions.
Orchestrate the full implementation lifecycle for: $ARGUMENTS
git branch --show-current
git log --oneline -5

You are the orchestrator. You drive the lifecycle, make referee decisions, and post to GitHub. You delegate heavy work (implementation, review, addressing) to Python Agent SDK subprocesses via the implement-cli CLI tool.
Drive forward autonomously. Execute all phases without pausing for approval between them.
All commands output JSON with a tracking section containing session IDs, costs, token usage, and the run_dir — a unique directory for each invocation that holds all artifacts (findings files, run context). This prevents collisions between concurrent runs.
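For example, assuming jq is available (it is not required by the CLI) and a response was saved to a file, the key tracking fields can be pulled out like this:

# Sketch: extract run_dir and cumulative cost from a saved CLI response (assumes jq; file name is illustrative)
RUN_DIR=$(jq -r '.tracking.run_dir' response.json)
COST=$(jq -r '.tracking.total_cost_usd' response.json)
echo "artifacts in $RUN_DIR, cumulative spend \$$COST"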
Resolve the wrapper script path first — it lives at run.sh in the plugin root.
Use the Glob tool to find **/implement-cli/run.sh, then set:
CLI="<path returned by Glob>"
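A quick sanity check after resolving the path (an optional sketch):

# Optional: fail fast if the resolved wrapper path is wrong or not executable
[ -x "$CLI" ] || { echo "implement-cli run.sh not found at: $CLI" >&2; exit 1; }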
# Run a single agent with a prompt
"$CLI" run-agent "prompt" \
--phase implement --role implementer --cwd "$(pwd)" \
[--context-files file1.md file2.md] \
[--session-id <resume-id>] \
[--model <model>] [--tools Read Write Bash]
# Run a single agent with prompt from a file
"$CLI" run-agent --prompt-file /tmp/task-description.md \
--phase implement --role implementer --cwd "$(pwd)"
# Run parallel reviewers
"$CLI" run-reviewers --pr 45 --round 1 --cwd "$(pwd)" \
[--reviewers review-correctness review-security]
# Print a human-readable summary of a completed run
"$CLI" summary <run_dir>/run_context.json
# Debug: list sessions from a run
"$CLI" debug sessions <run_dir>/run_context.json
# Debug: inspect a specific session
"$CLI" debug session <session-id> --cwd "$(pwd)"
# Debug: resume a session with follow-up
"$CLI" debug resume <session-id> "What happened with the null check?" --cwd "$(pwd)"
# Global flags (must come BEFORE the subcommand)
# --version Print version and exit
# --dry-run Print resolved config without running agents
# --max-depth 5 Maximum recursion depth (default: 5, must be >= 1)
# --max-cost 50.0 Maximum cumulative cost in USD (default: 50.0, must be > 0)
# --verbose / -v Verbose logging
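For example, to preview the resolved configuration with tighter limits before spending anything (a sketch using only the flags listed above):

# Sketch: dry-run with tighter limits; global flags go before the subcommand
"$CLI" --dry-run --max-cost 25.0 --max-depth 3 \
  run-agent "prompt" --phase implement --role implementer --cwd "$(pwd)"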
Parse $ARGUMENTS to determine what to work on and what to do.
Step 1 — Identify the target from the leading token:
- #N (issue number): Fetch with gh issue view N --comments and extract the task description and comment context. Check whether N is actually a PR with gh pr view N; if so, record the PR number. (A shell sketch of this detection appears after the table below.)

Step 2 — Determine scope from trailing instructions:
| Instructions | Effect |
|---|---|
| (none) | Default lifecycle: issue/freeform → Phases 1–6; PR number → Phases 4–6 |
| "just review" / "review only" | Run reviewers only, post findings, stop |
| "address the review feedback" | Run the addresser against existing findings |
| "skip planning" / "just implement" | Skip codebase exploration and planning |
Write the full task description to a file. Use a unique run directory to prevent collisions with concurrent runs:
RUN_DIR=$(mktemp -d -t implement-cli-XXXXXX)
cat > "$RUN_DIR/implement-task.md" <<'TASK'
<task description with acceptance criteria>
TASK
Invoke the implementer:
"$CLI" --verbose run-agent \
--prompt-file "$RUN_DIR/implement-task.md" \
--phase implement --role implementer --cwd "$(pwd)"
Parse the JSON output to extract:
- result.session_id — record for debugging
- result.text — search for PR_NUMBER: <N> to get the PR number
- tracking.total_cost_usd — monitor spend

If linked to an issue, post a progress comment:
gh issue comment <N> --body "Implementation PR created: #<pr-number>. Entering review phase."
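A sketch of that extraction, assuming jq is available and the implementer response was saved to a file (the file name is illustrative):

# Sketch: pull the session id, PR number, and running cost from the implementer response
SESSION_ID=$(jq -r '.result.session_id' impl-response.json)
PR=$(jq -r '.result.text' impl-response.json | grep -oE 'PR_NUMBER: [0-9]+' | grep -oE '[0-9]+')
COST=$(jq -r '.tracking.total_cost_usd' impl-response.json)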
Before Round 1: Rebase on main.
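One common way to do that on the PR branch (a sketch; adapt to the repository's workflow):

# Sketch: bring the PR branch up to date with main before the first review round
git fetch origin
git rebase origin/main
git push --force-with-lease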
Invoke parallel reviewers:
"$CLI" --verbose run-reviewers \
--pr <PR> --round <N> --cwd "$(pwd)"
Parse JSON output — each reviewer's result is in reviewers.<name>:
- session_id — for debugging
- text — the review findings
- cost_usd, input_tokens, output_tokens — tracking

Always check tracking.total_cost_usd after each invocation. If approaching the limit, warn and consider stopping.
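A sketch of reading one reviewer's result, assuming jq and a saved response (the reviewer name is an example from the list above):

# Sketch: extract one reviewer's findings and cost from a saved run-reviewers response
jq -r '.reviewers["review-correctness"].text' reviewers-round1.json
jq -r '.reviewers["review-correctness"].cost_usd' reviewers-round1.json
jq -r '.tracking.total_cost_usd' reviewers-round1.json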
You evaluate every finding. Read the relevant code yourself. For each finding:
| Decision | When to use |
|---|---|
| Accept (default) | Finding has merit — you verified by reading the code |
| Reject | Incorrect, irrelevant, or ill-considered |
If zero findings survive, post "Review Round <N>: no actionable findings" and skip to Phase 4.5.
Post referee decisions to GitHub. Write filtered findings to a file in the run directory:
cat > "$RUN_DIR/findings-round-<N>.md" <<'EOF'
<filtered findings table>
EOF
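One way to post the decisions and prepare the addresser prompt (a sketch; gh pr comment is standard GitHub CLI, and the prompt wording here is illustrative, not prescribed by the CLI):

# Sketch: post the referee decisions as a PR comment
gh pr comment <PR> --body-file "$RUN_DIR/findings-round-<N>.md"
# Sketch: write the addresser prompt used in the next command (wording is illustrative)
cat > "$RUN_DIR/address-prompt.md" <<'EOF'
Address the accepted review findings in the attached findings file. Do not re-litigate rejected findings.
EOF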
"$CLI" --verbose run-agent \
--prompt-file "$RUN_DIR/address-prompt.md" \
--phase address --role addresser --cwd "$(pwd)" \
--context-files "$RUN_DIR/findings-round-<N>.md"
If round 10+, escalate. Otherwise return to Step A.
If a reviewer or addresser produces unexpected results, use the debug commands:
# See what happened in the session
"$CLI" debug session <session-id>
# Ask a follow-up
"$CLI" debug resume <session-id> \
"Why did you flag the null check as a bug? It's guarded by the validator." \
--cwd "$(pwd)"
After the code review/address loop converges, run a documentation curation check. The docs reviewer doesn't just check docs that were changed — it proactively identifies where documentation is missing, outdated, or contradicted by the final state of the code (including all review fixes). This catches gaps like new CLI commands without usage docs, changed defaults that contradict existing guides, or architectural decisions that need ADRs.
Do NOT include review-docs in the Phase 4 reviewer pool. It runs only here, after the code review loop is complete.
"$CLI" --verbose run-reviewers \
--pr <PR> --round 1 --reviewers review-docs --cwd "$(pwd)"
Apply the same referee evaluation and address loop as Phase 4. The round counter starts from 1 (independent of Phase 4 rounds). Same 10-round escalation limit.
"$CLI" --verbose run-agent \
--prompt-file "$RUN_DIR/verify-prompt.md" \
--phase verify --role verifier --cwd "$(pwd)"
"$CLI" --verbose run-agent --prompt-file "$RUN_DIR/merge-prompt.md" \
--phase merge --role merger --cwd "$(pwd)"
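Both prompt files are assumed to exist; if they have not been written yet, a minimal sketch (the wording is illustrative, not part of the CLI):

# Sketch: the verify and merge prompts referenced above (wording is illustrative)
cat > "$RUN_DIR/verify-prompt.md" <<'EOF'
Verify PR #<PR>: run the test suite, confirm CI is green, and report PASS or FAIL.
EOF
cat > "$RUN_DIR/merge-prompt.md" <<'EOF'
Merge PR #<PR> once all checks pass, then delete the branch.
EOF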
Every CLI invocation returns a tracking section:
{
"tracking": {
"run_dir": "/tmp/implement-cli-abc123",
"total_cost_usd": 2.34,
"total_input_tokens": 150000,
"total_output_tokens": 25000,
"total_sessions": 5,
"max_depth_reached": 1,
"elapsed_seconds": 180.5,
"sessions": [
{
"session_id": "abc123",
"phase": "review",
"role": "review-correctness",
"cost_usd": 0.45,
"input_tokens": 30000,
"output_tokens": 5000,
"duration_ms": 36000,
"is_error": false,
"depth": 1
}
]
}
}
Run directory isolation: Each invocation creates a unique directory (shown in run_dir) for all artifacts — findings files, prompts, run context. This prevents collisions between concurrent runs. The run_context.json file inside the run directory has the full session history for debugging.
Monitor total_cost_usd throughout the lifecycle. For a human-readable overview of a completed run, use summary <run_context.json>. If costs are unexpectedly high, investigate using debug sessions and debug session commands.
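For a quick per-session cost breakdown while investigating, something like this works against a saved response (assumes jq; file name is illustrative):

# Sketch: per-session cost breakdown from a saved invocation response
jq -r '.tracking.sessions[] | "\(.phase)/\(.role): $\(.cost_usd)"' response.json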
Stop and flag the human when: