By krzemienski
Fixed-Point Deepen architecture of the Anneal plugin family. One plan, heated and cooled repeatedly — inline red team at every depth, Momus 0-100 scoring, convergence by variance/delta/cap.
npx claudepluginhub krzemienski/anneal --plugin anneal-temper
Emitter. On EMIT decision, serializes the run into an Opus 4.7 semantic-XML file plus a plan directory with markdown phase files and a depth-history log. The only agent permitted to write outside the plugin's scope.
Orchestrator for the Temper deepen loop. Not a planner. Tracks per-depth scores, invokes convergence-check.py, decides whether to call Prometheus-Temper again or exit the loop. Never writes plans, never reviews, only coordinates.
Functional validator. Builds and exercises the real artifact described in the plan. Captures build output, runtime output, screenshots, API responses, CLI stdout/stderr. Returns PASS or FAIL with evidence. NEVER writes tests, mocks, stubs, or test files.
Pre-plan consultant. Reads the user's task and the probe report and flags ambiguity, unstated requirements, and slop-risk patterns before the planner sees the task. Returns directives for Prometheus-Temper.
Post-plan reviewer. Reads a finished plan and finds every gap. In Temper, emits both a verdict (SAFE/CAUTION/RISKY/BLOCK) AND a numeric score 0-100. Score drives convergence.
Emitter. On EMIT decision, serializes the run into an Opus 4.7 semantic-XML file plus a plan directory with markdown phase files and a depth-history log. The only agent permitted to write outside the plugin's scope. Triggers: stage 7 of every Temper run, only on EMIT rollup. Keywords: atlas, emit, serialize, xml, plan-directory, depth-history.
Orchestrator for the Temper deepen loop. Not a planner. Tracks per-depth scores, invokes convergence-check.py, decides whether to call Prometheus-Temper again or exit the loop. Never writes plans, never reviews, only coordinates. Triggers: invoked once per Temper run at stage 4 to manage the loop. Keywords: orchestrator, deepen-loop, convergence, score-tracking, loop-exit.
Functional validator. Builds and exercises the real artifact described in the plan. Captures build output, runtime output, screenshots, API responses, CLI stdout/stderr. Returns PASS or FAIL with evidence. NEVER writes tests, mocks, stubs, or test files. Triggers: stage 6 of every Temper run. Keywords: hephaestus, functional-validation, build, real-artifact, evidence, no-mocks.
Pre-plan consultant. Reads the user's task and the probe report and flags ambiguity, unstated requirements, and slop-risk patterns before the planner sees the task. Returns directives the planner must follow. On validate re-loop, receives the failure evidence and must emit at least one directive referencing the failure root cause. Triggers: invoked at stage 3 of every Temper run, once. Keywords: metis, pre-plan, ambiguity, directives, slop-risk, clarifying-questions.
Post-plan reviewer. Reads a finished plan and finds every gap. In Temper specifically, emits BOTH a verdict (SAFE/CAUTION/RISKY/BLOCK) AND a numeric score 0-100. Score drives convergence. Triggers: invoked once per depth in the Temper deepen loop, after Red Team Trinity. Keywords: momus, post-plan-review, score, 0-100, rubric, convergence-input.
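The loop-exit decision described above (per-depth Momus scores feeding convergence by variance, delta, or cap) could be sketched roughly as follows. This is an illustration only, not the contents of convergence-check.py: the function name `should_stop`, the thresholds `min_delta` and `max_variance`, and the three-score `window` are all assumptions made for the example. Only the depth cap of 3 mirrors the documented `--depth 3` default.

```python
from statistics import pvariance


def should_stop(scores, max_depth=3, min_delta=2.0, max_variance=4.0, window=3):
    """Decide whether a deepen loop should exit.

    scores: per-depth review scores (0-100), oldest first.
    Returns (stop, reason). All thresholds are hypothetical defaults.
    """
    depth = len(scores)
    if depth == 0:
        return False, "no-scores"

    # Delta: the latest depth barely moved the score.
    if depth >= 2 and abs(scores[-1] - scores[-2]) < min_delta:
        return True, "delta"

    # Variance: the last `window` scores have settled into a narrow band.
    if depth >= window and pvariance(scores[-window:]) < max_variance:
        return True, "variance"

    # Cap: hard limit on depth, regardless of score behaviour.
    if depth >= max_depth:
        return True, "cap"

    return False, "continue"
```

An orchestrator built along these lines would call `should_stop` after each review pass and request another planning pass only when the reason is `"continue"`.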
Modifies files: hook triggers on file write and edit operations.
Uses power tools: uses Bash, Write, or Edit tools.
Controlled heating, slow cooling, iterative tempering — applied to work plans.
Runtime status (2026-04-22): Cast pipeline verified end-to-end in a real Claude Code worker — all 9 Greek-god agents dispatched, artifact written with exact byte match, XML emitted and passes validate-xml.py. See VERIFICATION-SUMMARY.md for the full trace. Alloy and Temper passed load verification; full E2E runs in progress.
Anneal is a Claude Code plugin family that converts a vague task into a rigorously-reviewed execution artifact: an XML prompt, a plan directory, and the skill enrichment needed to run it. It replaces the earlier deepest-plan prototype (shipped with 91 validator defects due to asymmetric vendoring) with a cleaner core built around three named plan-review archetypes — Metis, Momus, Oracle — and an always-on red team.
The name is literal: the plugin implements simulated annealing against plan-quality scores. Heat (generate candidates), cool (score, prune), temper (red-team critique), repeat until convergence.
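The heat / cool / temper cycle can be pictured with a small sketch. It is a plain generate, prune, and critique loop under stated assumptions, not the plugin's implementation and not textbook simulated annealing (there is no temperature schedule or probabilistic acceptance). The names `anneal`, `generate`, `score`, `critique`, `fanout`, and `keep` are hypothetical, and "convergence" is approximated here as "the best score stopped improving".

```python
def anneal(seed_plan, generate, score, critique, rounds=3, fanout=4, keep=2):
    """Toy heat/cool/temper loop over plan candidates.

    generate(plan) -> new candidate plan        (heat)
    score(plan)    -> number, higher is better  (cool)
    critique(plan) -> revised plan              (temper)
    Returns (best_plan, best_score_per_round).
    """
    pool = [seed_plan]
    history = []
    for _ in range(rounds):
        # Heat: fan out new candidates from every surviving plan.
        candidates = [generate(plan) for plan in pool for _ in range(fanout)]

        # Cool: score everything and keep only the strongest few.
        ranked = sorted(pool + candidates, key=score, reverse=True)
        pool = ranked[:keep]

        # Temper: let a critic revise each survivor.
        pool = [critique(plan) for plan in pool]

        history.append(max(score(plan) for plan in pool))

        # Stop early once a round fails to improve on the previous best.
        if len(history) >= 2 and history[-1] <= history[-2]:
            break
    return max(pool, key=score), history
```

With integer stand-ins for plans (`generate=lambda p: p + 1`, `score=lambda p: min(p, 100)`, `critique=lambda p: p`), each round raises the best score by one until the round limit is reached.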
Rather than picking one architecture as the default, we ship three complete, installable plugin variants — one per architecture — so you can install all three and compare side-by-side on real tasks.
| Path | What it is |
|---|---|
| README.md | You are here. |
| ARCHITECTURE-PROPOSALS.md | The architecture document (499 lines, 17 sections). Shared invariants, seven-stage spine, agent roster, three proposals. |
| COMPARISON-PLAYBOOK.md | How to test Cast / Alloy / Temper head-to-head. Decision rubric. |
| INSTALL.md | Install cheatsheet. Umbrella or per-plugin marketplace. |
| diagrams/anneal-architectures.html | Shared visual. Three Mermaid flowcharts with zoom/pan, editorial aesthetic. |
| .claude-plugin/marketplace.json | Umbrella dev marketplace listing all three plugins. |
| cast/ | Plugin · anneal-cast · Linear single-pour architecture. |
| alloy/ | Plugin · anneal-alloy · Tournament consensus architecture. |
| temper/ | Plugin · anneal-temper · Fixed-point deepen architecture. |
| _shared/ | Reference docs consumed by all three plugins (Opus 4.7 XML schema, agent prompts, plan-reviewer schema, plugin-format cheatsheet). |
| scripts/smoke-test.sh | Cross-plugin validation gate. Runs each plugin's validate-plugin.py and reports pass/fail. |
| scripts/phase-4-review-prompts.md | Staged reviewer prompts (architect + code-reviewer) for multi-perspective audit. |
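As a rough picture of what a cross-plugin validation gate does, the sketch below runs each plugin's validate-plugin.py and collects pass/fail. It is written in Python for illustration, whereas the repository's gate is the shell script scripts/smoke-test.sh, whose contents are not shown here. The function name `smoke_test` is hypothetical, and it assumes a validator signals failure through a non-zero exit code and that a missing validator counts as a failure.

```python
import subprocess
import sys
from pathlib import Path

# Plugin directories named in the table above.
PLUGINS = ("cast", "alloy", "temper")


def smoke_test(root, plugins=PLUGINS):
    """Run <root>/<plugin>/scripts/validate-plugin.py for each plugin.

    Returns {plugin_name: passed}. Assumes exit code 0 means pass.
    """
    results = {}
    for name in plugins:
        script = Path(root) / name / "scripts" / "validate-plugin.py"
        if not script.is_file():
            # Treat a missing validator as a failure rather than skipping it.
            results[name] = False
            continue
        proc = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
        )
        results[name] = proc.returncode == 0
    return results
```

A caller could print the resulting mapping and exit non-zero if any value is `False`, which gives the same pass/fail gate behaviour in a CI step.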
Each of the three plugin directories (cast/, alloy/, temper/) is a complete, installable Claude Code plugin:
{architecture}/
├── .claude-plugin/plugin.json # Manifest
├── .claude-plugin/marketplace.json # Per-plugin dev marketplace
├── README.md # Install + usage
├── PRD.md # Architecture-specific product requirements
├── ARCHITECTURE.md # Implementation detail
├── LICENSE # MIT
├── commands/anneal.md # /anneal-{name}:anneal slash command
├── skills/{7-8 skills}/SKILL.md # Metis, Prometheus variant, Momus, Red-Team Trinity, Oracle, Hephaestus, Atlas
├── agents/{9 agents}.md # Agent definitions with model assignments
├── hooks/hooks.json # SessionStart: plugin-loaded marker
├── scripts/validate-plugin.py # Self-validation
├── scripts/orchestrate.sh # Pipeline implementation
├── diagrams/{name}-architecture.html # Architecture-specific visual
└── docs/ # Invariants, worked example, emission format
Install all three, restart Claude Code, and all three commands register:
# Add the umbrella marketplace
/plugin marketplace add /Users/nick/Desktop/anneal
# Install all three
/plugin install anneal-cast@anneal-umbrella-dev
/plugin install anneal-alloy@anneal-umbrella-dev
/plugin install anneal-temper@anneal-umbrella-dev
After restart, three slash commands are available:
/anneal-cast:anneal <task> # Linear · ~8 spawns · ~4 min
/anneal-alloy:anneal <task> # Tournament · ~18 spawns · ~6 min · default --versions 5
/anneal-temper:anneal <task> # Fixed-point deepen · ~8×depth spawns · ~7 min · default --depth 3
Full install options, debugging, uninstall: see INSTALL.md.
All three satisfy the same eight invariants (red team always, validate always, XML + plan output, skill enrichment, unbounded re-loop, parallelization, category routing, dual-family prompts). They differ only in how stage 4 — Plan — works.