claude-pilot ⭐
Claude Code workflow engine - SPEC-first planning, TDD automation, GPT delegation, and context engineering. From idea to production, discipline included.
Autonomous agents. TDD-driven. Documentation sync.

💡 Why claude-pilot?
Claude Code is powerful, but unstructured. claude-pilot adds discipline:
- ❌ Vague prompts → ✅ PRP pattern (What, Why, How, Success Criteria)
- ❌ Manual iteration → ✅ Ralph Loop (autonomous TDD until tests pass)
- ❌ Context bloat → ✅ 3-Tier docs (optimized token usage)
- ❌ Stuck on bugs → ✅ GPT delegation (fresh perspective after 2nd failure)
- ❌ Documentation drift → ✅ Auto-sync (docs stay in sync with code)
Result: Higher quality code, faster iteration, happier team.
Key Benefits:
- ✅ SPEC-First: Requirements before code
- 🤖 Autonomous: Ralph Loop runs until tests pass
- 🔄 Persistent: Plan state saved to
.pilot/ directories
- 📚 Documented: Auto-sync 3-tier documentation
- 🧠 Intelligent: GPT Codex delegation for complex problems
Quick Start
# Add marketplace (use #release branch)
/plugin marketplace add changoo89/claude-pilot#release
# Install plugin
/plugin install claude-pilot
# Initial setup
/pilot:setup
⭐ Star this repo if it helps your workflow!
What is claude-pilot?
claude-pilot is a Claude Code plugin that brings structure and discipline to AI-assisted development.
Core Features
- SPEC-First TDD: Test-Driven Development with clear success criteria
- Ralph Loop: Autonomous iteration until all tests pass
- 3-Tier Documentation: Foundation/Component/Feature hierarchy for efficient context
- PRP Pattern: Structured prompts for unambiguous requirements
- Pre-commit Hook: JSON validation and markdown link check
- Pure Plugin: No Python dependency, native Claude Code integration
GPT Codex Integration
Intelligent GPT Delegation: Context-aware, autonomous delegation to GPT experts
Expert Mapping
| Situation | GPT Expert |
|---|
| Architecture decisions | Architect |
| Security-related code | Security Analyst |
| Large plans (5+ SCs) | Plan Reviewer |
| 2+ failed fix attempts | Architect |
| Coder blocked (automatic) | Architect |
Delegation Triggers
- Explicit: "ask GPT", "review architecture"
- Semantic: Architecture decisions, security issues, ambiguity
- Automatic: After 2nd failure (not first) → Progressive escalation
- Self-Assessment: Agent confidence scoring < 0.5 → Delegation
Configuration
export CODEX_REASONING_EFFORT="medium" # low | medium | high | xhigh
Full Guide: docs/ai-context/codex-integration.md
claude-pilot vs Vanilla Claude Code
| Feature | claude-pilot | Vanilla Claude Code |
|---|
| SPEC-First Planning | ✅ PRP format, success criteria | ❌ Ad-hoc planning |
| TDD Automation | ✅ Ralph Loop autonomous | ❌ Manual test-run cycle |
| Session Persistence | ✅ Plan file persistence | ❌ Context lost on exit |
| Documentation Sync | ✅ 3-tier auto-update | ❌ Manual docs only |
| Quality Gates | ✅ Type check, lint, coverage | ❌ No enforcement |
| GPT Delegation | ✅ Intelligent escalation | ❌ Manual delegation |
| Multi-Angle Review | ✅ Parallel verification | ❌ Single perspective |
Core Workflow
/00_plan → Create spec-driven plan with PRP format
/01_confirm → Review and approve plan
/02_execute → Execute with Ralph Loop + TDD
/03_close → Complete and commit
/review → Auto-review code (multi-angle)
/document → Auto-document changes
/pilot:setup → Configure MCP servers
Project Structure