From ponytail
Fully self-contained autonomous autoresearch loop with dynamic domain-aware agent teams — Boot→Think→Plan→Create→Review→Verify. Scans project, auto-generates expert team personas (Head + Thinking Team + Execution Team + Critic + Testing Agent) tailored to any domain. Teams debate internally before reporting to Head. ALL superpower skills embedded inline, zero external dependencies.
How this skill is triggered — by the user, by Claude, or both
Slash command
/ponytail:council-orchestrationThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
**Everything is built-in.** All 14 superpower patterns are embedded directly in this file. No external Skill calls needed. The council reads, applies, and loops autonomously until the objective is met.
Everything is built-in. All 14 superpower patterns are embedded directly in this file. No external Skill calls needed. The council reads, applies, and loops autonomously until the objective is met.
New in this version: Phase 0 PROJECT BOOT — scans the project, infers domain, generates domain-appropriate expert agent personas, and assembles a project-specific team before any work begins.
Model reference: All available models via proxy can be discovered live with council-orchestrator models.
PHASE 0 — PROJECT BOOT (runs once at session start):
Scan project → Infer domain → Generate COUNCIL_AGENTS.md
↓
MAIN LOOP (autonomous, never stop):
LOOP:
1. council-orchestrator status ← check current stage
2. Execute the stage handler ← uses embedded patterns below
3. council-orchestrator status ← verify transition
4. GOTO step 1 ← UNCONDITIONAL
BREAK ONLY when:
- Delivery check says objective satisfied → DELIVER
- __maxed_out__ safety limit → REPORT
┌────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌──────────┐ ┌──────────┐
│ BOOT │→ │ THINK │→ │ PLAN │→ │ CREATE │→ │ REVIEW │→ │ VERIFY │
│ (once) │ │ Thinking│ │Thinking+│ │Execution│ │Critic+ │ │ Testing │
│ │ │ Team │ │Execution│ │ Team │ │ Testing │ │ Agent │
└────────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬─────┘ └────┬─────┘
│ │ │ │ │
◄─────────────┴─────────────┴─────────────┴──────────────┘
│ loop back via loopback if teams find issues │
└───────────────────────────────────────────────────────────┘
│ if !satisfied → next-iteration → GOTO top
└──────────────────────────────────────────┘
┌──────────────────┐
│ HEAD AGENT │ ← Orchestrator. Domain's lead coordinator.
│ (1 agent) │ Receives reports, routes tasks, approves/rejects.
└────────┬─────────┘
┌─────────────────┼──────────────────┬──────────────────┐
▼ ▼ ▼ ▼
┌─────────────────┐ ┌───────────────┐ ┌──────────────┐ ┌────────────────┐
│ THINKING TEAM │ │EXECUTION TEAM │ │ CRITIC AGENT │ │ TESTING AGENT │
│ (2+ agents) │ │ (2+ agents) │ │ (1 agent) │ │ (1 agent) │
│ │ │ │ │ │ │ │
│ Domain thinkers │ │Domain builders│ │Domain external│ │Domain verifier │
│ & strategists │ │& implementors │ │ challenger │ │& quality gate │
└─────────────────┘ └───────────────┘ └──────────────┘ └────────────────┘
All titles are DYNAMIC — generated at boot based on the project's detected domain. A cooking app gets chefs and food inspectors. A legal firm gets lawyers and senior partners. A medical system gets doctors and clinical validators.
council-orchestrator init "<objective>" # Start (stage begins at "boot")
council-orchestrator status # Current stage
council-orchestrator advance <stage> # Mark done
council-orchestrator loopback <stage> "reason" # Go back
council-orchestrator next-iteration # New iteration
council-orchestrator models # Discover live model catalog
Before starting a council session, run:
council-orchestrator models
This writes COUNCIL_MODELS.md with all models available via your AI proxy at http://127.0.0.1:4001.
If you have GitHub Copilot connected:
If only OpenCode Zen:
For multi-model orchestration, use the sibling skill ai-council-orchestration which switches models per-stage.
LOOP:
1. Run: council-orchestrator status
2. Match the "Stage:" field:
"boot" → execute Phase 0 — PROJECT BOOT (once, at session start)
"think" → execute Stage 1 — THINK
"plan" → execute Stage 2 — PLAN
"create" → execute Stage 3 — CREATE
"review" → execute Stage 4 — REVIEW & TEST
"verify" → execute Stage 5 — VERIFY & DELIVER
"__delivery_check__"→ run DELIVERY CHECK
"__maxed_out__" → print summary, STOP
3. After handler finishes → IMMEDIATELY GOTO step 1
Announce: ## 🚀 [Phase 0 — BOOT] Scanning project and assembling domain-aware agent team
This phase runs ONCE at session start. It generates the expert team that powers all subsequent stages.
Read these files (in order of priority):
README.md, README.txt, README.rst
package.json, requirements.txt, setup.py, Cargo.toml, go.mod, pom.xml
.env.example (for clues, not secrets)
docs/, CONTRIBUTING.md, ARCHITECTURE.md
First 5 source files in the primary language
Collect:
Based on the scan, classify the domain. Examples (not exhaustive — any domain is valid):
| Domain Signals | Detected Domain |
|---|---|
lodash, react, django, api, webpack, jest | Software / Web Development |
aml, kyc, sanctions, transaction monitoring, compliance | AML / Financial Crime Compliance |
ledger, balance sheet, audit trail, GAAP, IFRS, CA | Accounting / Audit |
plaintiff, defendant, statute, jurisdiction, legal brief | Legal / Law |
patient, clinical, HIPAA, diagnosis, EHR, medical | Healthcare / Medical |
recipe, ingredient, menu, chef, kitchen, cooking | Food & Hospitality |
curriculum, student, lesson, pedagogy, LMS, course | Education / EdTech |
portfolio, derivative, yield, P&L, trading, quant | Finance / Trading |
property, tenant, lease, mortgage, escrow, realtor | Real Estate |
logistics, shipment, warehouse, SKU, inventory, SCM | Supply Chain / Logistics |
If domain is unclear after scan → ask the user once:
"I scanned the project but the domain isn't clear. What industry or field does this project serve? (e.g., software, legal, audit, healthcare, education...)"
For the detected domain, generate domain-appropriate titles for each of the 6 agent roles. The titles must reflect REAL professionals who would work in that domain:
Template for each agent:
## <Role Name>
**Title:** <Real-world professional title in this domain>
**Persona:** You are the <title> for the <project name> project. Your domain expertise is <domain>. Your specific mandate within the council is <mandate>. When evaluating work, you think like a <title> would: <domain-specific perspective>.
**Mandate:** <Specific job in the council>
**Domain perspective:** <How this domain expert evaluates quality>
The 6 mandatory roles (all must be generated):
Head Agent — The lead coordinator. Oversees all agents, routes tasks, accepts/rejects team outputs, makes final calls. Example titles: Chief Technology Officer, Chief Compliance Officer, Managing Partner, Medical Director, Head Chef, Chief Auditor.
Thinker A — Domain strategist. Thinks long-term, sees big picture, proposes architectural/strategic approaches. Example: Senior Architect, Financial Crime Analyst, Senior Counsel, Diagnostic Specialist.
Thinker B — Domain analyst. Investigates specifics, identifies risks and edge cases, challenges assumptions. Example: Risk Intelligence Analyst, Systems Researcher, Legal Researcher, Clinical Analyst.
Executor A — Domain implementor. Builds/creates/executes the primary deliverable. Example: Senior Developer, AML System Specialist, Associate Attorney, Medical Practitioner.
Executor B — Domain support implementor. Assists execution, handles secondary deliverables, documentation. Example: Software Engineer, Compliance Documentation Specialist, Paralegal, Research Nurse.
Critic Agent — External domain challenger. Acts as an outside auditor or skeptical expert. Assumes work is wrong until proven right. Example: Government Compliance Officer, Senior Code Reviewer, External Audit Partner, Peer Reviewer.
Testing Agent — Domain quality verifier. Runs all tests, validates all outputs, finds what doesn't work. Reports directly to Head. Example: QA Engineer, Regulatory Test Lead, Evidence Verifier, Clinical Validator.
# Council Agents
## Project: <project name>
## Domain: <detected domain>
## Generated: <timestamp>
---
## Head Agent
**Title:** <title>
**Persona:** <full persona description>
**Mandate:** Observe all teams. Route tasks. Accept or reject team consensus. Make final calls on loopbacks.
---
## Thinking Team
### Thinker A
**Title:** <title>
**Persona:** <full persona description>
**Mandate:** Propose strategic approaches during THINK and PLAN stages.
### Thinker B
**Title:** <title>
**Persona:** <full persona description>
**Mandate:** Investigate specifics, identify risks, challenge Thinker A's proposals.
---
## Execution Team
### Executor A
**Title:** <title>
**Persona:** <full persona description>
**Mandate:** Primary implementor during CREATE stage.
### Executor B
**Title:** <title>
**Persona:** <full persona description>
**Mandate:** Support implementation, handle documentation, secondary deliverables.
---
## Critic Agent
**Title:** <title>
**Persona:** <full persona description>
**Mandate:** Adversarial review. Assume approach is wrong. Challenge at Stages 1, 2, 3, 4.
---
## Testing Agent
**Title:** <title>
**Persona:** <full persona description>
**Mandate:** Run ALL tests. Find ALL errors. Report to Head. Never suppress failures.
Display generated agent team and active Ponytail mode to user:
🎯 Project: <name> | Domain: <domain>
👑 Head: <title>
🧠 Thinking Team: <Thinker A title> + <Thinker B title>
⚒️ Execution Team: <Executor A title> + <Executor B title>
🔍 Critic: <title>
🧪 Testing Agent: <title>
🦎 Ponytail Mode: <run /ponytail command to check and report current level>
Then: council-orchestrator advance boot "agent team assembled" → GOTO LOOP step 1
IRON LAW: Every agent spawned in Stages 1–5 MUST have their persona from COUNCIL_AGENTS.md injected into their prompt. Generic "Thinker" prompts are forbidden after boot.
Used whenever a team (Thinking Team or Execution Team) must solve a problem together.
ROUND 1 — Independent Proposals:
Agent A: "Here is my proposed approach: <approach>"
Agent B: "Here is my proposed approach: <approach>"
(each independent, no reading the other's yet)
ROUND 2 — Cross-Critique:
Agent A reads B's proposal → "Here is what's wrong with B's approach: <critique>"
Agent B reads A's proposal → "Here is what's wrong with A's approach: <critique>"
(each critiques the other's flaws, risks, gaps)
ROUND 3 — Convergence:
Agent A: "Given B's critique of mine and my critique of B, here is my revised position: <revised>"
Agent B: "Given A's critique of mine and my critique of A, here is my revised position: <revised>"
→ If they agree → consensus reached
→ If still diverging → Head Agent arbitrates: "The correct path is X because Y"
Write TEAM_CONSENSUS.md:
# Team Consensus — <stage> — <topic>
## Team: <Thinking Team / Execution Team>
## Agents: <Thinker A title> + <Thinker B title>
### Round 1 Proposals
**<Agent A>:** <proposal summary>
**<Agent B>:** <proposal summary>
### Round 2 Critiques
**<Agent A> on <Agent B>'s proposal:** <critique>
**<Agent B> on <Agent A>'s proposal:** <critique>
### Round 3 Convergence
**Final Consensus:** <agreed approach>
**Key decision:** <most important choice made>
**Rejected alternatives:** <what was ruled out and why>
Head reads TEAM_CONSENSUS.md:
Testing Agent errors ALWAYS trigger a new team debate cycle with the error report as additional context.
Announce: ## 💭 [Stage 1 — THINK] Thinking Team assembling — reading COUNCIL_AGENTS.md
Load COUNCIL_AGENTS.md first. All agents must be instantiated with their domain personas.
Head Agent assigns the Thinking Team: "Read project files, docs, recent commits. Understand what exists. Also read and load all helper skills (skills/ponytail/SKILL.md, skills/ponytail-review/SKILL.md, skills/ponytail-audit/SKILL.md, skills/ponytail-debt/SKILL.md, skills/ponytail-gain/SKILL.md, skills/ponytail-help/SKILL.md, skills/loop/SKILL.md) to integrate their rules and capabilities into the session context. Prepare independent proposals."
Invoke Team Debate Protocol with topic: "Propose the best architecture for: <objective>"
Each Thinker (using their domain persona) independently proposes 1-2 architectures. Then they critique each other. Then converge.
Cover in debate:
Head Agent stress-tests the Thinking Team consensus:
Write THOUGHT_REPORT.md with: interpretations from domain perspective, constraints, risk analysis, 3+ architectures compared (pros/cons), recommended approach with domain-specific justification.
Spawn Critic Agent (using their domain persona from COUNCIL_AGENTS.md):
"You are the . Assume the Thinking Team's approach is wrong. What domain-specific assumptions could be false? What risks were missed? Is this the strongest approach from a perspective? Produce
CRITIQUE_REPORT.md. If no concerns, state EXACTLY: 'No concerns — approach is sound.'"
council-orchestrator loopback think "<reason>" → new Thinking Team debate with Critic concerns injected → GOTO LOOP step 1council-orchestrator advance think "approved" → GOTO LOOP step 1Announce: ## 📋 [Stage 2 — PLAN] Thinking Team planning, Execution Team validating
Load COUNCIL_AGENTS.md. Use domain personas throughout.
Thinking Team debates file/module structure. Each Thinker proposes a structure → debate → consensus on which files are created/modified. Each file = one clear responsibility. Follow Ponytail rules: map the absolute minimum file structure needed. Avoid speculative helper files, single-implementation interfaces, or config bloat.
After Thinking Team produces task list, invoke Execution Team (not Thinking Team) to review feasibility:
Each task = one action (2-5 minutes):
Task 1: Write failing test
Task 2: Run to confirm failure
Task 3: Implement minimal code
Task 4: Run to confirm pass
Task 5: Commit
Write TASK_EXECUTION_PLAN.md with:
Check:
Spawn Critic (domain persona): "Are any tasks under-specified? Dependencies correct? Risks from Stage 1 covered? Would a approve this plan?"
council-orchestrator loopback plan "<reason>" → GOTO LOOP step 1council-orchestrator advance plan "approved" → GOTO LOOP step 1Announce: ## 🔧 [Stage 3 — CREATE] Execution Team building — Testing Agent standing by
Load COUNCIL_AGENTS.md. Executors build. Testing Agent validates after each task batch.
RED — Write Failing Test First:
- Write ONE test per behavior
- Name clearly describes behavior
- Use real code (no mocks unless unavoidable)
- NO production code without a failing test first
Verify RED — Watch It Fail:
- Run the test
- Confirm it fails (for the RIGHT reason — feature missing, not typo)
- If it passes, you're testing existing behavior → FIX THE TEST
- If it errors, fix the error → re-run until it fails correctly
GREEN — Minimal Implementation:
- Write SIMPLEST code to pass the test, adhering strictly to Ponytail's ladder (YAGNI → stdlib → native → one-line → minimum code)
- No unrequested abstractions, boilerplate, or dependencies
- Don't add what the test doesn't require
Verify GREEN — Watch It Pass:
- Run the test
- Confirm it passes
- Other tests still pass
- Output pristine (no errors/warnings)
- If fails → FIX THE CODE, not the test
REFACTOR — Clean Up (while staying green):
- Remove duplication
- Improve names
- Extract helpers
- Keep tests green
- Don't add behavior
- Mark deliberate simplifications and shortcuts with a `ponytail: <ceiling>, <upgrade path>` comment
IRON LAW: NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST. Write code before test? Delete it. Start over. No exceptions.
For independent tasks, spawn fresh sub-agents with their COUNCIL_AGENTS.md personas. Ensure the Ponytail ruleset and active level are fully loaded in their context:
Agent(persona=<Executor A from COUNCIL_AGENTS.md>, prompt="""
You are the <Executor A title> for the <project name> project.
Implement Task N: <description>
<TDD instructions>
Adhere strictly to the Ponytail ladder: YAGNI → stdlib → native → one-line → minimum code.
Do not introduce unrequested abstractions or dependencies.
Self-review before reporting done.
""")
Each sub-agent gets:
After each task sub-agent completes:
Both must pass before moving to next task.
If tasks have NO shared state or sequential dependencies, dispatch them in parallel:
After Execution Team completes a batch, invoke Testing Agent (domain persona):
Agent(persona=<Testing Agent from COUNCIL_AGENTS.md>, prompt="""
You are the <Testing Agent title>. Run ALL tests, ALL linters, ALL builds.
Report EVERY failure. Do not suppress or minimize.
Produce TEST_RESULTS.md.
""")
If Testing Agent finds errors → Head Agent routes to Execution Team for Debate + Fix → Testing Agent re-runs → repeat until clean.
If you discover a reusable pattern/capability is missing during creation:
Write a brief skill definition: what it is, when to use, core pattern.
Save as skill for future reference.
Do NOT improvise undocumented logic.
If YES → council-orchestrator advance create "all done" → GOTO LOOP step 1
If NO → council-orchestrator loopback create "<reason>" → GOTO LOOP step 1
Announce: ## 🔍 [Stage 4 — REVIEW & TEST] Critic + Testing Agent activated
Load COUNCIL_AGENTS.md. Use domain personas for all agents.
Before reviewing:
BASE_SHA=$(git rev-parse HEAD~1) HEAD_SHA=$(git rev-parse HEAD)Dispatch simultaneously (use their COUNCIL_AGENTS.md personas):
Agent(persona=<Critic from COUNCIL_AGENTS.md>):
"As the <Critic title>: Review for logic errors, domain-specific correctness,
edge cases, security gaps, anti-patterns, maintainability.
Also run a ponytail-review for over-engineering and complexity. Find what to delete/simplify using tags:
delete (dead code/flexibility), stdlib (reinvented stdlib), native (dependency doing what platform does),
yagni (abstraction with 1 implementation), shrink (same logic, fewer lines).
List location, what to cut, and what replaces it. Report the net lines removable.
Would a <Critic title> approve this in production?"
Agent(persona=<Testing Agent from COUNCIL_AGENTS.md>):
"As the <Testing Agent title>: Run ALL tests. Run linters. Run builds.
Run every validation command. Report EVERY failure — do not suppress any.
Produce TEST_RESULTS.md with full output."
Additionally, run the /ponytail-review command (or the ponytail-review skill) directly on the current git diff to harvest a concrete delete-list of over-engineered elements, and append this output to the review feedback.
When Testing Agent reports errors:
When receiving Critic feedback:
Never: performative agreement, blind implementation, batch without testing.
Push back if: suggestion breaks existing functionality, lacks full context, violates YAGNI.
Write REVIEW_ISSUES.md with all findings categorized:
Phase 1 — Root Cause Investigation (BEFORE any fix):
1. Read error messages carefully — stack traces, line numbers
2. Reproduce consistently — exact steps, every time?
3. Check recent changes — git diff, recent commits
4. Trace data flow — where does the bad value originate?
Phase 2 — Pattern Analysis:
1. Find working examples — similar code that works
2. Compare against references — read completely
3. Identify differences — what's different between working and broken?
Phase 3 — Hypothesis and Testing:
1. Form single hypothesis — "I think X is root cause because Y"
2. Test minimally — smallest possible change, one variable at a time
3. Verify before continuing — worked? Yes → fix. No → new hypothesis.
Phase 4 — Implementation:
1. Create failing test case — simplest possible reproduction
2. Implement single fix — ONE change, address root cause
3. Verify fix — test passes, no regressions
4. If 3+ fixes failed → STOP. Question the architecture.
IRON LAW: NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST.
council-orchestrator loopback review "<reason>" → re-run review → GOTO LOOP step 1REVIEW_ISSUES.md has ZERO unresolved issues AND Testing Agent confirms zero test failurescouncil-orchestrator advance review "all clear" → GOTO LOOP step 1
Announce: ## ✅ [Stage 5 — VERIFY & DELIVER] Testing Agent final validation
Load COUNCIL_AGENTS.md. Testing Agent runs final full validation with domain expertise.
IRON LAW: NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE.
For EVERY claim, follow this gate:
1. IDENTIFY — What command proves this claim?
2. RUN — Execute the FULL command (fresh, complete)
3. READ — Full output, check exit code, count failures
4. VERIFY — Does output confirm the claim?
- If NO: State actual status with evidence
- If YES: State claim WITH evidence
Red flags: Using "should", "probably", "seems to" before verification. Expressing satisfaction before verifying. Trusting agent success reports without checking.
- Run the FULL test suite — not just unit tests
- Build the project — confirm compilation
- Check all integration points
- Run any manual verification steps
- Output: full verification log
Agent(persona=<Testing Agent from COUNCIL_AGENTS.md>, prompt="""
You are the <Testing Agent title>. Your job is final validation.
Original objective: <objective>
Completion criteria: <from council_journal.md>
Verify EVERY criterion from a <domain> perspective. Check:
- Is every requirement met?
- Is output complete and self-contained?
- Any domain-specific edge cases or gaps?
- Can the output be used as-is by a <domain> practitioner?
Produce VERIFICATION_SIGN_OFF.md
- If ALL satisfied: "VERIFIED — Ready to deliver"
- If ANY unsatisfied: state each gap explicitly
""")
Before sign-off, run the /ponytail-debt command (or ponytail-debt skill) to scan the codebase for any ponytail: comments, verify their upgrade triggers, and ensure they are captured in a tracked ledger file PONYTAIL-DEBT.md in the project root.
1. Verify tests pass
2. Detect environment (normal repo vs worktree)
3. Determine base branch (main/master)
4. Present options (for user interaction if needed):
- Merge locally
- Push and create PR
- Keep branch as-is
- Discard
Head Agent reads VERIFICATION_SIGN_OFF.md and makes the call:
council-orchestrator advance verify "passed" → GOTO LOOP step 1council-orchestrator loopback verify "<reason>" → GOTO LOOP step 1When council-orchestrator status shows stage: __delivery_check__:
Step 1: Read objective from council_journal.md
Step 2: Read output (all created files, VERIFICATION_SIGN_OFF.md, COUNCIL_AGENTS.md)
Step 3: Compare output to completion criteria
If objective FULLY satisfied:
## 📦 [DELIVERY] Objective satisfied!
## 🎯 Objective: <objective>
## ✅ Iterations: N | Total loops: M
## 📄 Output: <path>
Present final output. STOP THE LOOP.
If NOT fully satisfied:
## 🔄 [LOOP] Iteration N complete — objective not fully satisfied
## 📋 Unsatisfied: <gaps>
## 🚀 Starting Iteration N+1 with accumulated context
council-orchestrator next-iteration
Then GOTO LOOP step 1 — stage is now "think" again with ALL accumulated context.
When context window reaches 140,000 tokens:
council-orchestrator compact/compactcouncil_journal.md AND COUNCIL_AGENTS.mdcouncil-orchestrator statusNever compact mid-sub-agent task — finish the atomic unit first.
| # | Directive | Rule |
|---|---|---|
| 1 | NEVER STOP | No user input needed. Resolve blockers autonomously. Never ask "should I continue?" |
| 2 | GOTO LOOP step 1 | After every stage action, IMMEDIATELY go back to status check |
| 3 | Boot first | ALWAYS run Phase 0 boot first. No stage can start without COUNCIL_AGENTS.md existing. |
| 4 | Domain personas required | Every sub-agent MUST receive their domain persona from COUNCIL_AGENTS.md. Generic prompts forbidden. |
| 5 | Teams debate before reporting | Thinking Team and Execution Team use 3-round debate before producing consensus. |
| 6 | Testing Agent is the gate | Testing Agent errors stop the cycle. No advancing until Testing Agent confirms zero failures. |
| 7 | Head routes, not decides unilaterally | Head accepts/rejects consensus. Head routes errors back to relevant team. |
| 8 | TDD always | NO production code without a failing test first. Write code first? Delete it. |
| 9 | Verify before claiming | NO "it works" without fresh command output. Run the command, read the output. |
| 10 | Root cause before fix | NO fix without investigation first. Symptom fixes are failure. |
| 11 | Never silence Critic | Critic must report at Stages 1, 2, 3, 4. Explicit "no concerns" if none. |
| 12 | Never bundle tasks | Each atomic task gets its own sub-agent. One behavior per test. |
| 13 | Never lose context | Journal is truth. COUNCIL_AGENTS.md is team truth. Both carried through every stage. |
| 14 | Never deliver unverified | Only after Stage 5 sign-off AND delivery check pass. |
| 15 | Dual-test Stage 3 | Run spec review THEN domain quality review. Both must pass. |
| 16 | Create missing capabilities | Don't improvise. Write the pattern as a skill. |
| 17 | Auto-compact at 140K | Run /compact when context ≥ 140K. Re-read COUNCIL_AGENTS.md after compaction. |
| 18 | Safety limit: 50 iterations | Journal preserved if hit. Manual intervention needed. |
| 19 | Deadman switch | 10+ loops on same stage? Radically change approach. |
| 20 | Follow Ponytail rules | Use Ponytail ladder: YAGNI → stdlib → native → one line → minimum. Avoid speculative abstractions/boilerplate. Mark shortcuts with ponytail: comments. |
This file IS the complete superpower library. All 14 patterns + domain agent system are embedded inline:
| Look For | Stage | Pattern Name | IRON LAW |
|---|---|---|---|
| Generating domain agent team | Phase 0 — BOOT | Project Boot + Domain Detection | Boot before EVERYTHING. COUNCIL_AGENTS.md must exist. |
| Team collaboration | All stages | Team Debate Protocol | 3-round debate before consensus. Head accepts/rejects. |
| Exploring ideas, comparing architectures | 1 — THINK | Brainstorming (Thinking Team) | Thinking Team debates architecture. No singleton thinker. |
| Breaking down work into tasks | 2 — PLAN | Writing Plans (Thinking + Execution) | Execution Team validates plan feasibility. |
| Isolating work | 2 — PLAN | Git Worktrees | Work in isolation. No worktree on main branch. |
| Running independent tasks concurrently | 3 — CREATE | Parallel Dispatch (Executor A + B) | Executors run in parallel on independent task sets. |
| Task-by-task execution | 3 — CREATE | Subagent-Driven Dev (with personas) | Executor persona from COUNCIL_AGENTS.md in every sub-agent. |
| Writing code that works | 3 — CREATE | TDD | NO production code without a failing test first. |
| Missing capability during build | 3 — CREATE | Writing Skills | Write the pattern. Don't improvise. |
| Running all tests, finding all errors | 3 & 4 | Testing Agent Gate | Testing Agent: run all, report all, suppress nothing. |
| Reviewing code quality | 4 — REVIEW | Code Review (Critic + Testing Agent) | Domain-aware Critic + Testing Agent run simultaneously. |
| Routing errors to fix teams | 4 — REVIEW | Error Routing (Head → Execution Team) | Errors trigger Execution Team debate → fix → re-test. |
| Responding to feedback | 4 — REVIEW | Receiving Review | Verify before implementing. Push back if wrong. |
| Fixing bugs | 4 — REVIEW | Systematic Debugging | No fixes without root cause investigation first. |
| Confirming fixes | 4 & 5 | Verification Before Completion | No claims without fresh command output. |
| Merging, PR, finishing | 5 — VERIFY | Finishing Branch | Verify tests first. Then present options. |
| Model discovery | Step 0 | Live Model Catalog | Run council-orchestrator models before starting |
council-orchestrator models## 🔵 [Init] Council starting — domain-aware agent teams, all 14 patterns embedded inlinecouncil-orchestrator init "<full objective>" ← starts at stage "boot"council-orchestrator status → stage will be "boot" → execute Phase 0Phase 0 runs first. The council scans the project, detects the domain, generates the expert team, writes COUNCIL_AGENTS.md, then advances to "think". All subsequent stages use the domain-aware personas.
The council is active. The team is being assembled. The loop is turning.
npx claudepluginhub armaan-hub/ai-coincil --plugin ponytailProvides behavioral guidelines to reduce common LLM coding mistakes, focusing on simplicity, surgical changes, assumption surfacing, and verifiable success criteria.
Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.