Autonomous spec-driven development loop for Claude Code
npx claudepluginhub lucasduys/forgeAutonomous spec-driven development loop for Claude Code with behavioral guardrails, knowledge graph integration, and design system support.
Official prompts.chat marketplace - AI prompts, skills, and tools for Claude Code
Behavioral guidelines to reduce common LLM coding mistakes, derived from Andrej Karpathy's observations
Claude Code plugins for the Slidev presentation framework
Claude Code is powerful, but for non-trivial features you become the glue: prompting, reviewing, re-prompting, losing context, starting over. A 12-task feature takes dozens of manual exchanges and multiple sessions.
Forge replaces that entire loop with three commands.
/forge brainstorm "your feature idea"
/forge plan
/forge execute --autonomy full
No npm install, no build step, no dependencies. Requires Claude Code v1.0.33+.
claude plugin marketplace add LucasDuys/forge
claude plugin install forge@forge-marketplace
Forge runs three nested loops. Each has its own circuit breakers and progression logic.
Outer loop: Phase progression -- Controls which spec is active and which phase runs next. Phases: idle > executing > reviewing_branch > verifying > idle. Driven by the stop hook state machine.
Middle loop: Task progression -- Within a spec, tasks advance through the dependency DAG. Streaming topological dispatch: tasks start the instant their specific dependencies complete, not when the entire tier finishes. 20-40% faster than tier-gated waves.
Inner loop: Quality iteration -- Each task cycles through implement > test > fix (max 3) > debug > Codex rescue > redecompose > blocked. Circuit breakers at every transition prevent infinite loops.
The stop hook (hooks/stop-hook.sh) intercepts every Claude exit. It reads state from .forge/.forge-loop.json, calls routeDecision() in forge-tools.cjs (a 200+ line state machine), and either blocks exit with the next prompt or allows it. Claude never needs a human to tell it what to do next.
Claude acts > attempts exit > stop hook fires > routeDecision() > block with next prompt > repeat
Completion signal: Claude outputs <promise>FORGE_COMPLETE</promise> only when all tasks are complete and verified. The hook detects it, generates a summary, deletes the loop file, and allows exit.
/forge execute
|
LOAD plan DAG + artifact contracts
|
STREAMING SCHEDULER -----> picks tasks whose deps are satisfied
| scores complexity (0-20)
| routes to haiku / sonnet / opus
| assembles context bundle
|
+---> RESEARCHER: deep research (official docs, papers, codebase conventions)
| |
+---> EXECUTOR: implement + test (TDD at thorough depth)
| |
| REVIEWER: spec compliance + blast radius + conventions
| |
| (optional) CODEX REVIEW: adversarial cross-model check
| |
| ARTIFACT WRITE: structured output for downstream consumers
| |
| +---> Pass: atomic commit, unlock dependents
| +---> Fail: debug > Codex rescue > re-decompose > block
|
+---> CONTEXT MONITOR: save handoff at 60%, resume in new session
|
v
VERIFIER: goal-backward verification (existence > substantive > wired > runtime)
|
DONE: all tasks committed, branch ready
The forge-researcher agent investigates before the executor touches code. Dispatched for unfamiliar tech, security-sensitive code, and external integrations. Follows a tiered source hierarchy:
| Tier | Source | Trust |
|---|---|---|
| 1 | Official docs (Context7, WebFetch) | Highest |
| 2 | Peer-reviewed papers, RFCs (Semantic Scholar) | High |
| 3 | Vendor blogs (Anthropic, Stripe, Vercel) | Medium |
| 4 | Community (GitHub discussions, Stack Overflow) | Medium |
| 5 | Blog posts (only with Tier 1-3 corroboration) | Low |