Help us improve
Share bugs, ideas, or general feedback.
From codegen-bridge
Use when debugging failed, stuck, or misbehaving Codegen agent runs — systematic 4-phase approach. Triggers on agent run failures, timeouts, or unexpected results. Analogous to superpowers systematic-debugging but for cloud agents.
npx claudepluginhub evgenygurin/codegen-bridge --plugin codegen-bridgeHow this skill is triggered — by the user, by Claude, or both
Slash command
/codegen-bridge:debugging-failed-runsThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
When a Codegen agent run fails, gets stuck, or produces wrong results — do NOT immediately retry with the same prompt. Systematically diagnose the root cause first.
Provides structured self-debugging workflow for AI agent failures: capture state, diagnose patterns, apply contained recoveries, generate introspection reports. For loops, retries without progress, context drift.
Use when repeated fix attempts fail, the agent appears stuck in a loop, or complexity is increasing without progress
Runs parallel AI agent investigations on bugs and produces a consolidated debug report. Use for 'debug', 'find bug', or 'problem analysis' requests.
Share bugs, ideas, or general feedback.
When a Codegen agent run fails, gets stuck, or produces wrong results — do NOT immediately retry with the same prompt. Systematically diagnose the root cause first.
Iron Law: NO RETRY WITHOUT DIAGNOSIS.
Retrying a failed run with the same prompt wastes time and credits. Understand WHY it failed, fix the cause, THEN retry.
Announce at start: "I'm using the debugging-failed-runs skill to diagnose this agent failure."
failedrunning for >10 minutes without progressDO NOT GUESS. Read the logs first.
codegen_get_run(run_id=<id>)
Record: status, duration, error message (if any), PR links.
codegen_get_logs(run_id=<id>, limit=50, reverse=false)
Read chronologically (oldest first). Look for:
| Signal | What it means |
|---|---|
Last thought before failure | Agent's final reasoning — often reveals the actual problem |
| Repeated tool calls to same file | Agent stuck in a loop |
Bash with test commands + failures | Tests failing — read the test output |
Bash with git commands + errors | Branch/merge conflicts |
| No logs at all | Sandbox setup failure — infrastructure issue |
| Logs stop abruptly | Timeout or crash |
codegen_check_integration_health
Rule out infrastructure issues: GitHub app connected? Webhooks working? API accessible?
Based on evidence from Phase 1, classify into one of these categories:
Symptoms: Agent did the wrong thing, misunderstood the task, worked on wrong files.
Evidence: Agent thoughts show confusion about requirements. Tool calls target unexpected files.
Fix: Rewrite the prompt with clearer instructions. See prompt-crafting skill.
Symptoms: Agent wrote code but tests fail. Build errors in logs.
Evidence: Bash tool calls show test/build output with errors.
Fix: Include test expectations in prompt. Add "if tests fail, read the error and fix" instruction. Ensure dependencies are available in sandbox.
Symptoms: No logs, immediate failure, permission errors, git clone failures.
Evidence: codegen_check_integration_health shows issues. Logs show auth errors or missing repos.
Fix: Fix integration issues first. Check: repo registered? GitHub app installed? API key valid?
Symptoms: Agent times out. Logs show extensive file reading. Agent tries to refactor everything.
Evidence: Duration >10 min. Hundreds of log entries. Agent touches many unrelated files.
Fix: Break task into smaller pieces. Add "Do NOT modify files outside: [list]" constraint.
Symptoms: Agent repeats same action. Makes a change, reverts it, makes it again.
Evidence: Logs show cyclic pattern: edit → test → fail → revert → edit (same thing).
Fix: Provide explicit fix direction. "The error is X. Fix it by doing Y, not Z."
Symptoms: Git operations fail. Agent can't push. Merge conflicts.
Evidence: Git error messages in logs.
Fix: Create a fresh run from updated main branch. Don't resume — start clean.
Based on classification:
prompt-crafting skillcodegen_resume_run(run_id, prompt="The test fails because X. Fix by Y.")codegen_check_integration_healthexecuting-via-codegen or bulk-delegation for the subtaskscodegen_resume_run(run_id, prompt="Stop trying X. Instead, do Y because Z.")codegen_stop_run, create fresh run with explicit approachcodegen_stop_run if still runningcodegen_get_run(run_id)| Don't | Do Instead |
|---|---|
| Retry immediately with same prompt | Diagnose first |
| Resume without reading logs | Always read logs before resuming |
| Blame "the agent" generically | Identify specific failure category |
| Create 5 retries in a row | Max 2 retries, then rethink |
| Ignore timeout as "flaky" | Timeout = scope too large or loop |
| Add more text to prompt hoping it helps | Identify what's MISSING, add only that |
If 2 retries fail with different errors each time, the task is likely:
Report honestly: "This task failed twice with different errors. I recommend [splitting it / doing it locally / providing more context]."