By krzemienski
Tournament Consensus architecture of the Anneal plugin family. N parallel planner biases compete; a synthesizer blends their best material into one plan. Always-on red team, always-on functional validation.
npx claudepluginhub krzemienski/anneal --plugin anneal-alloy
Emitter. Assembles the final artifact when the rollup is EMIT: Opus 4.7 semantic-XML prompt + plan directory. Writes outputs to ${ANNEAL_RUNS_ROOT:-./.anneal/runs}/{run_id}/. Never edits plans. Never re-reviews. Serializes. Invoked at stage 7 of every anneal-alloy run on SAFE or CAUTION rollup.
Functional validator. Builds and exercises the real artifact the plan describes. No mocks, no test files, no stubs. Captures build output, runtime evidence, console logs with timestamps. Returns PASS or FAIL with cited evidence. Iron Rule — if the real system does not work, FIX THE REAL SYSTEM; never modify the plan to make verdict PASS. Invoked at stage 6 of every anneal-alloy run.
Pre-plan consultant. Reads the user's task + probe report and flags ambiguity, unstated requirements, and slop-risk patterns before any planner sees the task. Returns directives for the planner and clarifying questions only on BLOCK. Invoked at stage 3 of every anneal-alloy run.
Post-plan reviewer. Audits the synthesized blended plan produced by the Synthesizer. Finds every gap — missing, ambiguous, assumed, off-happy-path. Returns a ruthless per-phase audit envelope. In Alloy, audits the BLEND, not individual variants. Invoked at stage 4 close-out of every anneal-alloy run.
Architecture synthesizer. Reads every reviewer envelope (Metis, Momus, Red-Team Trinity × 3) and emits a single bird's-eye verdict with release coherence, deployment risk, breaking changes, monitoring recommendations. Final gate before Validate. Invoked at stage 5 close-out of every anneal-alloy run.
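The Emitter's output location, ${ANNEAL_RUNS_ROOT:-./.anneal/runs}/{run_id}/, uses the shell's default-value expansion: the variable is used if it is set and non-empty, otherwise the literal fallback applies. A minimal Python sketch of the same resolution follows; the function name `resolve_run_dir` and the `env` parameter are hypothetical and not part of the plugin.

```python
from __future__ import annotations

import os
from pathlib import Path


def resolve_run_dir(run_id: str, env: dict[str, str] | None = None) -> Path:
    """Mirror ${ANNEAL_RUNS_ROOT:-./.anneal/runs}/{run_id}/ (hypothetical helper)."""
    env = dict(os.environ) if env is None else env
    # ":-" falls back when the variable is unset OR set to the empty string.
    root = env.get("ANNEAL_RUNS_ROOT") or "./.anneal/runs"
    return Path(root) / run_id
```

Passing `env` explicitly keeps the helper easy to check without touching the real environment.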
Emitter for anneal-alloy runs. Assembles the final Opus 4.7 semantic-XML prompt plus plan directory at ${ANNEAL_RUNS_ROOT:-./.anneal/runs}/{run_id}/ when the rollup is EMIT. The only agent permitted to write outside the plugin's scoped run directory. Never edits plans, never re-reviews, only serializes. Triggers: invoke at stage 7 of every anneal-alloy run on SAFE or CAUTION rollup, after Hephaestus returns PASS.
Functional validator — builds the real artifact the approved anneal-alloy plan describes, runs it in a scratch worktree, captures build output plus runtime evidence plus timestamped console logs, and returns PASS or FAIL with cited evidence. Never writes mocks, stubs, test doubles, or test files. Iron Rule: if the real system does not work, FIX THE REAL SYSTEM — never modify the plan to make the verdict PASS. The only agent permitted to touch a real filesystem outside the plugin's scratch. Triggers: invoke at stage 6 of every anneal-alloy run, only after Oracle returns SAFE or CAUTION.
Pre-plan consultant for anneal-alloy. Reads the user's task and the probe report at stage 3 and emits directives for the planners plus findings on ambiguity, unstated requirements, and slop-risk patterns before any Prometheus-Alloy variant is spawned. Returns clarifying questions only on BLOCK. Load-bearing for Alloy: one ambiguous task produces N useless variants and a Synthesizer blending garbage. Triggers: invoke at stage 3 of every anneal-alloy run, and also on every re-loop after a Hephaestus FAIL — re-loop routes to Metis (not Synthesizer) so the next run's planners are rebiased at the root with sharper directives.
Post-plan reviewer. Reads the synthesized blended plan produced by the Synthesizer and finds every gap — what's missing, what's ambiguous, what's a load-bearing assumption, what breaks off the happy path. Returns a ruthless audit envelope with per-phase findings. In Alloy, Momus audits the BLEND, not individual variants. Triggers: invoked at stage 4 close-out of every anneal-alloy run, after the Synthesizer completes.
Architecture synthesizer — emits the final bird's-eye verdict before Hephaestus. Reads every prior reviewer envelope (Metis, Momus, Red-Team Trinity × 3) plus synthesis-provenance.md and returns one SAFE/CAUTION/RISKY/BLOCK verdict with release coherence, deployment risk, breaking changes, and monitoring recommendations. Last reviewer in the pipeline — cannot downgrade any prior verdict. Triggers: invoke at stage 5 close-out of every anneal-alloy run, after all Red-Team Trinity envelopes persist.
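The gating described in these agent entries can be sketched as a small severity ordering. This is an illustration under stated assumptions, not the plugin's code: it assumes prior reviewer envelopes can be mapped onto the same SAFE/CAUTION/RISKY/BLOCK scale, and it reads "cannot downgrade any prior verdict" as "the final verdict is at least as severe as the worst prior one". All names below are hypothetical.

```python
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered by severity so max() picks the worst verdict.
    SAFE = 0
    CAUTION = 1
    RISKY = 2
    BLOCK = 3


def oracle_rollup(prior: list[Verdict], own: Verdict) -> Verdict:
    """Oracle may raise severity but never go below the worst prior verdict."""
    return max([own, *prior])


def may_validate(rollup: Verdict) -> bool:
    """Hephaestus (stage 6) runs only on a SAFE or CAUTION rollup."""
    return rollup in (Verdict.SAFE, Verdict.CAUTION)


def may_emit(rollup: Verdict, hephaestus_pass: bool) -> bool:
    """The Emitter (stage 7) additionally requires a Hephaestus PASS."""
    return may_validate(rollup) and hephaestus_pass
```

For example, a single RISKY envelope from the red team would keep the rollup at RISKY even if Oracle's own reading were SAFE, so neither validation nor emission would proceed under this reading.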
Uses power tools
Uses Bash, Write, or Edit tools
Controlled heating, slow cooling, iterative tempering — applied to work plans.
Runtime status (2026-04-22): Cast pipeline verified end-to-end in a real Claude Code worker — all 9 Greek-god agents dispatched, artifact written with exact byte match, XML emitted and passes validate-xml.py. See VERIFICATION-SUMMARY.md for the full trace. Alloy and Temper passed load verification; full E2E runs in progress.
Anneal is a Claude Code plugin family that converts a vague task into a rigorously reviewed execution artifact: an XML prompt, a plan directory, and the skill enrichment needed to run it. It replaces the earlier deepest-plan prototype (shipped with 91 validator defects due to asymmetric vendoring) with a cleaner core built around three named plan-review archetypes (Metis, Momus, Oracle) and an always-on red team.
The name is literal: the plugin implements simulated annealing against plan-quality scores. Heat (generate candidates), cool (score, prune), temper (red-team critique), repeat until convergence.
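The heat / cool / temper cycle can be pictured with a toy loop. This is a sketch only: it is a greedy generate-score-prune loop that follows the three verbs above, not textbook simulated annealing with a temperature schedule, and it is not the plugin's implementation. The callables `heat`, `score`, and `temper`, and the parameters `keep`, `tol`, and `max_rounds`, are all hypothetical stand-ins for agent calls and stopping rules.

```python
from typing import Callable, Iterable, TypeVar

Plan = TypeVar("Plan")


def anneal(
    seed: Plan,
    heat: Callable[[Plan], Iterable[Plan]],  # generate candidate plans
    score: Callable[[Plan], float],          # plan-quality score, higher is better
    temper: Callable[[Plan], Plan],          # apply red-team critique to a plan
    keep: int = 2,
    tol: float = 0.01,
    max_rounds: int = 10,
) -> Plan:
    pool = [seed]
    best = score(seed)
    for _ in range(max_rounds):
        # Heat: expand every surviving plan into new candidates.
        candidates = [c for p in pool for c in heat(p)]
        # Cool: score everything and prune to the top `keep`.
        survivors = sorted(pool + candidates, key=score, reverse=True)[:keep]
        # Temper: revise each survivor in response to critique.
        pool = [temper(p) for p in survivors]
        new_best = max(score(p) for p in pool)
        # Converged once the best score stops moving by more than `tol`.
        if abs(new_best - best) < tol:
            break
        best = new_best
    return max(pool, key=score)
```

With numbers standing in for plans, `anneal(0, lambda p: [p + 1, p + 2], lambda p: -abs(p - 10), lambda p: p)` climbs toward 10 and stops once a round brings no improvement.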
Rather than picking one architecture as the default, we ship three complete, installable plugin variants — one per architecture — so you can install all three and compare side-by-side on real tasks.
| Path | What it is |
|---|---|
| README.md | You are here. |
| ARCHITECTURE-PROPOSALS.md | The architecture document (499 lines, 17 sections). Shared invariants, seven-stage spine, agent roster, three proposals. |
| COMPARISON-PLAYBOOK.md | How to test Cast / Alloy / Temper head-to-head. Decision rubric. |
| INSTALL.md | Install cheatsheet. Umbrella or per-plugin marketplace. |
| diagrams/anneal-architectures.html | Shared visual. Three Mermaid flowcharts with zoom/pan, editorial aesthetic. |
| .claude-plugin/marketplace.json | Umbrella dev marketplace listing all three plugins. |
| cast/ | Plugin · anneal-cast · Linear single-pour architecture. |
| alloy/ | Plugin · anneal-alloy · Tournament consensus architecture. |
| temper/ | Plugin · anneal-temper · Fixed-point deepen architecture. |
| _shared/ | Reference docs consumed by all three plugins (Opus 4.7 XML schema, agent prompts, plan-reviewer schema, plugin-format cheatsheet). |
| scripts/smoke-test.sh | Cross-plugin validation gate. Runs each plugin's validate-plugin.py and reports pass/fail. |
| scripts/phase-4-review-prompts.md | Staged reviewer prompts (architect + code-reviewer) for multi-perspective audit. |
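The cross-plugin gate in the table above (run each plugin's validate-plugin.py, report pass/fail) could look roughly like the following. This is a Python rendering for illustration only, not the contents of scripts/smoke-test.sh; it assumes the validator signals failure through a non-zero exit code, which the source does not state, and the function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable

PLUGINS = ("cast", "alloy", "temper")


def run_validator(plugin_dir: Path) -> bool:
    """Run one plugin's self-validation; assumes exit code 0 means pass."""
    result = subprocess.run(
        [sys.executable, "scripts/validate-plugin.py"],
        cwd=plugin_dir,
    )
    return result.returncode == 0


def smoke_test(
    repo_root: str, runner: Callable[[Path], bool] = run_validator
) -> dict[str, bool]:
    """Return a pass/fail map with one entry per plugin directory."""
    return {name: runner(Path(repo_root) / name) for name in PLUGINS}
```

The `runner` parameter exists so the reporting logic can be exercised without invoking the real validators.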
Each of the three plugin directories (cast/, alloy/, temper/) is a complete, installable Claude Code plugin:
{architecture}/
├── .claude-plugin/plugin.json # Manifest
├── .claude-plugin/marketplace.json # Per-plugin dev marketplace
├── README.md # Install + usage
├── PRD.md # Architecture-specific product requirements
├── ARCHITECTURE.md # Implementation detail
├── LICENSE # MIT
├── commands/anneal.md # /anneal-{name}:anneal slash command
├── skills/{7-8 skills}/SKILL.md # Metis, Prometheus variant, Momus, Red-Team Trinity, Oracle, Hephaestus, Atlas
├── agents/{9 agents}.md # Agent definitions with model assignments
├── hooks/hooks.json # SessionStart: plugin-loaded marker
├── scripts/validate-plugin.py # Self-validation
├── scripts/orchestrate.sh # Pipeline implementation
├── diagrams/{name}-architecture.html # Architecture-specific visual
└── docs/ # Invariants, worked example, emission format
Install all three, restart Claude Code, and all three commands register:
# Add the umbrella marketplace
/plugin marketplace add /Users/nick/Desktop/anneal
# Install all three
/plugin install anneal-cast@anneal-umbrella-dev
/plugin install anneal-alloy@anneal-umbrella-dev
/plugin install anneal-temper@anneal-umbrella-dev
After restart, three slash commands are available:
/anneal-cast:anneal <task> # Linear · ~8 spawns · ~4 min
/anneal-alloy:anneal <task> # Tournament · ~18 spawns · ~6 min · default --versions 5
/anneal-temper:anneal <task> # Fixed-point deepen · ~8×depth spawns · ~7 min · default --depth 3
Full install options, debugging, uninstall: see INSTALL.md.
All three satisfy the same eight invariants (red team always, validate always, XML + plan output, skill enrichment, unbounded re-loop, parallelization, category routing, dual-family prompts). They differ only in how stage 4 — Plan — works.
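For Alloy, stage 4 is the tournament: N planner biases each produce a plan variant in parallel, and a synthesizer blends their best material into one plan. The sketch below shows that shape with plain functions. It is an assumption-laden stand-in: the real planners and Synthesizer are agents, and the per-phase "keep the highest-scoring version" rule, along with the names `tournament`, `plan`, and `score_phase`, is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def tournament(
    task: str,
    biases: list[str],
    plan: Callable[[str, str], dict[str, str]],  # (task, bias) -> {phase: text}
    score_phase: Callable[[str], float],         # higher is better
) -> dict[str, str]:
    # Fan out: one planner variant per bias, run in parallel.
    with ThreadPoolExecutor(max_workers=len(biases)) as pool:
        variants = list(pool.map(lambda bias: plan(task, bias), biases))

    # Blend: for each phase, keep the best-scoring version across variants.
    blended: dict[str, str] = {}
    for variant in variants:
        for phase, text in variant.items():
            if phase not in blended or score_phase(text) > score_phase(blended[phase]):
                blended[phase] = text
    return blended
```

With the default `--versions 5`, `biases` would hold five entries; the blended plan is then what the post-plan reviewer audits, rather than any individual variant.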