From beast-forge
Runs multi-stage planning pipeline with verification gates and persistent Ralph state for complex tasks spanning 3+ files or unclear scope.
npx claudepluginhub malakhov-dmitrii/forgeThis skill uses the workspace's default tool permissions.
Two machines: **Plan Forge** (refine until bulletproof) + **Verification Chain** (prove it's actually done). Ralph provides persistence.
Orchestrates 4-phase execution loop (IMPLEMENT, VALIDATE, ADVERSARIAL REVIEW, COMMIT) for complex work units with specs. Verifies outputs adversarially in multi-agent setups.
Orchestrates adversarial plan-implement-review pipeline by spawning role-specific agents with separate contexts. Run after /brainstorm, /repo-eval, /repo-health, or /doc-health.
Share bugs, ideas, or general feedback.
Two machines: Plan Forge (refine until bulletproof) + Verification Chain (prove it's actually done). Ralph provides persistence.
Quick fix (<3 files, clear) → skip forge, direct executor + MICRO-VERIFY
Standard (3-10 files) → PLAN FORGE → EXECUTE → VERIFICATION CHAIN
Complex (10+, architectural) → PLAN FORGE --full → EXECUTE with checkpoints → CHAIN
Vague input → PLAN FORGE --discuss → rest as standard
Use mode: "ralph" with session_id. Phase prefix bf- distinguishes from plain ralph.
{"mode":"ralph","session_id":"<session>","current_phase":"bf-forge|bf-execute|bf-verify",
"state":{"planning_mode":"forge","forge_iteration":1,
"gates":{"skeptic":null,"integration":null,"second_opinion":null},
"verify":{"evidence":null,"auditor":null},
"completed_steps":[],"slug":"task-name"}}
Forge writes status to two channels on every phase transition:
state_write)Shows ralph:N/M in the terminal statusbar. N = current forge iteration, M = max (default 5).
Call state_write at each phase transition:
state_write(
mode: "ralph",
active: true,
iteration: <forge_iteration>,
max_iterations: <total_phases>,
current_phase: "bf-<phase>",
task_description: "<slug>",
session_id: <session_id>,
state: {
planning_mode: "forge",
gates: { skeptic: "PASS"|"FAIL"|null, integration: ..., second_opinion: ... },
steps_done: N,
steps_total: M
}
)
notepad_write_priority)Shows detailed progress in notepad (visible via notepad_read). Update at each transition:
notepad_write_priority(
content: "[FORGE] <PHASE> v<iter> | <gate_summary> | <steps_done>/<steps_total> steps"
)
Examples at each stage:
[FORGE] PRECEDENT | — | 0/8 stages
[FORGE] RESEARCH | — | 1/8 stages
[FORGE] CHALLENGE | — | 2/8 stages
[FORGE] PLAN v1 | — | 3/8 stages
[FORGE] REVIEW v1 | Skeptic:PASS Integration:— 2nd:— | 3/8 stages
[FORGE] REVIEW v1 | Skeptic:PASS Integration:PASS 2nd:FAIL | 3/8 stages
[FORGE] PLAN v2 | ↑ revising: 2nd opinion found P1 issue | 3/8 stages
[FORGE] REVIEW v2 | Skeptic:PASS Integration:PASS 2nd:PASS | 3/8 stages
[FORGE] EXECUTE | ALL GATES PASSED | 5/7 steps done | 5/8 stages
[FORGE] VERIFY | Evidence:— Auditor:— | 6/8 stages
[FORGE] VERIFY | Evidence:PASS Auditor:PASS | 7/8 stages
[FORGE] DOCS-REFRESH | VERIFIED | 8/8 stages
| # | Phase | Description |
|---|---|---|
| 1 | PRECEDENT | Search institutional knowledge |
| 2 | RESEARCH | Read files, spike assumptions (top-3 mandatory) |
| 3 | CHALLENGE | Independent review of approach |
| 4 | CLARIFY | User questions + visionary mode selection |
| 5 | PLAN-DRAFT | Write plan with typed claims |
| 6 | VISIONARY | N parallel passes (optional) |
| 7 | COMPARATOR | 3-tier classify + reality-check (if visionary ran) |
| 8 | USER-DECIDE | Pick standard / visionary / merge |
| 9 | PLAN-FINAL | Rewrite plan per decision |
| 10 | REVIEW | Sealed blind parallel + stacked meta |
| 11 | EXECUTE | Cascade gemini->opus, TDD |
| 12 | VERIFY | Evidence + Auditor |
| 13 | DOCS-REFRESH | Docs + DB sweep |
For the statusline max_iterations: use forge iteration cap (default 5) during PLAN/REVIEW loop. Switch to 13 (total stages) during EXECUTE onward.
state_write(mode: "ralph", active: false, completed_at: <ISO>)
notepad_write_priority(content: "[FORGE] DONE ✓ | <slug> | <total_time>")
mempalace_diary_write(
agent_name: "claude-code",
topic: "forge-<slug>",
entry: "FORGE:<slug>|iters:N|gates:PASS|spikes:M|drawers:K|lesson:<AAAK summary>"
)
On failure:
notepad_write_priority(content: "[FORGE] BLOCKED | <slug> | <reason>")
Non-blocking diary guarantee (D): mempalace_diary_write failure (timeout, MCP error, palace unavailable) MUST NOT block forge completion. Per the A runtime degradation contract: catch the failure, log [palace-degraded] diary_write failed: <reason> to notepad, mark forge completed in forge.db regardless. Forge completion is the source of truth; the diary entry is a secondary write.
Refinement loop. Two phases: draft the plan, then gate it. Cycles until all 3 gates pass.
PRECEDENT → RESEARCH (+top-3 mandatory spikes) → CHALLENGE → CLARIFY (+visionary?) →
PLAN-DRAFT (typed claims) → [VISIONARY STREAM] → [COMPARATOR] → USER DECISION →
PLAN-FINAL → REVIEW (blind parallel + stacked meta) →
EXECUTE (cascade: gemini→opus) → VERIFY → DOCS-REFRESH (+DB sweep)
Before touching code, search what the project already knows:
.omc/plans/ (if exists) for past plans touching same systems. Note approach + outcome.docs/architecture/{system}.md (if exists) to understand current design.## Common Failures section (if exists) for known failure patterns..omc/forge.db (if exists) — past forge runs, risk scores, cached spikes:
-- Past runs on these systems
SELECT slug, status, iteration, json_extract(context, '$.lesson') as lesson
FROM forges WHERE systems LIKE '%<system>%' AND status IN ('completed','abandoned')
ORDER BY completed_at DESC LIMIT 5;
-- Risk scores
SELECT system, fail_rate, avg_iterations FROM system_risk
WHERE system IN ('<system1>', '<system2>');
-- Cached spikes (don't re-test confirmed assumptions)
SELECT assumption, result, actual FROM spikes
WHERE permanent = 1 OR tested_at > datetime('now', '-30 days');
-- Co-failure warnings
SELECT system_a, system_b, gate,
ROUND(1.0 * fail_count / total_count, 2) as fail_rate
FROM co_failures
WHERE (system_a IN ('<systems>') OR system_b IN ('<systems>'))
AND CAST(fail_count AS REAL) / total_count > 0.3;
Surface: "reply-worker + engage-scheduler fail Integration 60% — include both in scope?"ctx_batch_execute to keep research results in sandbox.mempalace_search(query="<touched-systems keywords>", wing="<project slug>", limit=5) — semantic recall across code + docs + convosmempalace_kg_query(entity="<system>") — one call per touched system for typed facts (deps, status, models, incidents)mempalace_diary_read(agent_name="claude-code", last_n=3) — recent agent diary entries for cross-session context
Summarize results into a ## Prior institutional context section that is prepended to PLAN-DRAFT. Runtime degradation contract: each palace call has a 10s soft timeout. On timeout, error, or MCP-unavailable, log one line via notepad_write_priority prefixed [palace-degraded] <call> failed: <reason>, skip the failing call, and continue PRECEDENT with file-based sources only. NEVER block or retry-loop on a palace call — forge must remain forward-progressable when MCP is flaky. The same contract applies to every palace call mentioned elsewhere in this skill (Skeptic kg_query, draft commits, completion diary_write, spike mirror). If the --no-mempalace flag is set, skip steps 7 entirely.scc --by-file <dirs> (if installed) — flag high-complexity files for careful reading.✅ confirmed: [assumption] or ❌ refuted — actual: [reality]Self-challenge first: What am I assuming? What's the simplest approach? What breaks?
Then get an independent challenge on the APPROACH (not the plan yet — that comes at REVIEW):
which codexcodex exec "Review this approach. What are the 3 most likely failure modes?" -C <repo> -s read-only -c 'model_reasoning_effort="high"'Agent(model="opus", prompt="You are a senior engineer who has NEVER seen this codebase. Here is the proposed approach: [approach]. Find the 3 most likely failure modes.")If challenge reveals a fundamental flaw → revise approach before writing the plan.
Adaptive default based on scope:
Visionary stream — search for a "significantly better" approach?
a) skip ← default if <3 files
b) 1 pass ← default if 3-10 files
c) 3 passes ← default if 10+ files
d) N passes (any number, no token limit)
Save choice to forges.context as visionary_mode. If --no-visionary flag, auto-select a.
Before invoking the planner agent, select it inline from the forge's flag.
v3 is now the default — only explicit pipelineV3 === false opts back to v2:
const ctx = getForgeContext(cwd, forgeId);
Agent(subagent_type=(ctx?.pipelineV3 === false ? 'planner' : 'planner-v3'), ...)
planner-v3 emits DAG JSON + parallel claim fan-out per ADR §3.
planner is the explicit opt-out path (setForgeContext(forgeId, 'pipelineV3', false)).
Each step MUST contain a Claims: block classifying every assertion:
fact: — verifiable codebase statement
Required: file:line OR doc#section OR spike:id citation.
Skeptic verifies citation exists AND contains the claimed thing.
design_bet: — reasoned guess about behavior
Required: assumption: + validation_plan: + blast_radius:
strategic: — direction-level choice
Required: rationale: + alternatives_considered:
kg_fact: — assertion about project-level truth stored in the MemPalace knowledge graph
Required: kg_citation: <triple_id> discovered via mempalace_kg_query.
Use kg_fact: (not fact:) when the assertion is about project-level infrastructure/config/client status/architectural invariants that live outside the code — models a service uses, who owns a project, current tenant status, browser stack, patterns like "reply_queue uses SKIP_LOCKED". Use fact: for code-level claims with file:line citations. See agents/planner-v3.md "### kg_fact claims" for the full decision rule and examples.
NO global "% hard evidence" gate. Skeptic checks PER CLAIM:
current=false → mirage (see skeptic.md pattern 11)
Plan with ANY mirage → FAIL Skeptic gate.Refuted kg_fact feedback loop: when Skeptic verdicts a kg_fact: claim as MIRAGE, the orchestrator enqueues a palace_drafts row with draft_type='kg_invalidate', payload:{triple_id} so the stale KG fact can be cleaned up in Machine 3. Subject to the A runtime degradation contract (skip enqueue if MCP flaky).
Each step also gets complexity: simple|complex:
complex if step has strategic: claims OR touches 3+ filesStep format:
### Step N: [Title]
Do: [specific action]
Files: [exact paths]
Complexity: simple|complex
Claims:
- fact: [claim] — file:line
- design_bet: [claim]
assumption: ...
validation_plan: ...
blast_radius: ...
Acceptance criteria:
- static: [type check, lsp, semgrep]
- unit: [test command + expected]
- e2e: [command proving feature works] (where applicable)
Failure mode: [what goes wrong]
Fallback: [what to do]
Checkpoint: true/false
Depends on: [step numbers]
Plan MUST include:
Read the project's CLAUDE.md for test/build/deploy commands. Don't invent — use what the project already has. E2E criteria use REAL commands the project actually uses (curl APIs, DB queries, CLI tools). NEVER wait for cron.
If user chose visionary mode in CLARIFY (b, c, or d), run N parallel passes:
| Pass | Angle | Agent | Reads | Question |
|---|---|---|---|---|
| 1 | simpler | opus or codex | PLAN-DRAFT only | "Is there a simpler approach?" |
| 2 | better | codex preferred, opus fallback | PLAN-DRAFT only | "Could this be 10x better?" |
| 3 | blind_spots | opus | ORIGINAL USER REQUEST only | "What is the user really trying to achieve?" |
Agent prompt files at ~/.claude/plugins/forge/agents/visionary-simpler.md, visionary-better.md, visionary-blind-spots.md.
Each pass:
recordVisionaryPass(cwd, forgeId, iteration, { passNumber, angle, agent, content })Passes run in parallel when possible (spawn multiple Agent() calls in one message).
Fresh opus agent reads: PLAN-DRAFT + all visionary passes + original user request.
Agent prompt at ~/.claude/plugins/forge/agents/comparator.md.
3-tier claim classification:
codebase-verifiable → grep/read codebase → verified | rejectedexternally-verifiable → WebSearch/context7 if available, else needs-external-checkstrategic → present to user with tradeoffs, no verdictOutput to user (before REVIEW):
TL;DR:
Standard: [1-line]
Visionary: [1-line]
Main diff: [1-line]
DIFF ITEMS:
- [item]: gain=[...], cost=[...], tier=[...], check=[✓/✗/unknown]
REALITY CHECK:
Confirmed: [...] Rejected: [...] Unknown: [...]
RECOMMENDATION: standard | visionary | merge
User picks: standard | visionary | merge | edit manually.
Record: recordComparatorReport(cwd, forgeId, iteration, { tldr, diffItems, realityCheck, recommendation })
After user decides: updateComparatorDecision(cwd, forgeId, iteration, decision)
PLAN-FINAL is written (or rewritten) based on user's choice, then proceeds to REVIEW.
Gates do NOT read .omc/forge.db. They do NOT look for other gate findings. Isolation is enforced by:
gates table (with blind=1) via INSERT OR REPLACE after both returngates with blind=0, meta_findings populatedSkeptic (blind, opus, fresh):
fact: claim citation (file exists, line contains claimed thing)design_bet: has assumption + validation_plan + blast_radiusstrategic: has rationale + alternatives_consideredkg_fact: against the ### KG Snapshot section included in the sealed input bundle (orchestrator pre-fetches mempalace_kg_query(entity=<subject>) for each plan subject BEFORE spawning Skeptic, then includes the JSON snapshot in Skeptic's prompt). Skeptic matches kg_citation by (predicate, object, current=true) against the snapshot. MCP tools are NOT reachable from subagent frontmatter — inline-snapshot is the only reliable path. (Pattern 11 in agents/skeptic.md)Integration (blind, sonnet, fresh):
Second Opinion (stacked, codex preferred / opus fallback):
Record each gate: recordGate(cwd, forgeId, iteration, gate, result, findings, { blind, inputsSeen, metaFindings })
ANY gate FAIL → revise plan, re-run failed gate(s) only, bump iteration.
For typed claims, also record: recordClaim(cwd, forgeId, iteration, stepNumber, { claimType, claimText, citation }) for each claim in the plan, then validateClaim(cwd, claimId, { result, notes }) as Skeptic validates them.
FINAL-PLAN saved to .omc/plans/FINAL-PLAN-{slug}.md (or project root if .omc doesn't exist). Present approval summary including all gate results. User reviews → "ok" or corrects → proceed to EXECUTE.
v3 is the default execute path. Only explicit pipelineV3 === false opts back to the v2 linear cascade:
const ctx = getForgeContext(cwd, forgeId);
if (ctx?.pipelineV3 === false) {
// v2 linear cascade (legacy opt-out path)
} else {
await fanOutStreams(cwd, forgeId, /* task spawner */);
}
Per step in PLAN-FINAL:
IF step.complexity == 'simple' AND `which gemini` available:
1. gemini exec (model: gemini-2.5-flash-preview, timeout: 90s)
2. Static analysis (tsc/lsp/semgrep)
FAIL → git checkout changed files, opus rewrites
3. Step acceptance criteria (unit tests)
FAIL → git checkout changed files, opus rewrites
4. opus REVIEW pass (read git diff only):
APPROVE → next step
REJECT → git checkout changed files, opus rewrites
ELSE:
opus executes directly (standard path)
Token savings: ~70% per simple step (gemini ~5k + opus review ~3k vs opus-only ~25k).
Per step: RED (write failing test from acceptance criteria) → GREEN (minimal code to pass) → REFACTOR.
Parallel where steps are independent. Serial where dependent.
Checkpoint steps (checkpoint: true): run acceptance criteria before proceeding.
Static analysis on every change.
After all steps → VERIFICATION CHAIN.
Run the project's static analysis tools. Common:
tsc --noEmit # TypeScript
lsp_diagnostics <files> # IDE diagnostics
semgrep scan --config=.semgrep/ # Custom gotcha rules (if .semgrep/ exists)
Run project's test command on changed modules. If no tests exist for changed code → flag as GAP.
Where the plan has e2e acceptance criteria, run them. ACTIVELY trigger — never wait for cron or scheduled runs. Use whatever mechanism the project has.
Evidence Collector (fresh sonnet agent, agents/evidence-collector.md):
Auditor (fresh opus agent, agents/auditor.md):
GAPS FOUND →
1. Check CLAUDE.md "Common Failures" — new pattern? Add it.
2. Back to EXECUTE with specific gaps list.
3. Ralph iteration++.
VERIFIED →
E8: Docs Refresh (final stage — see below).
Done.
After verification passes, run a scoped documentation hygiene pass. This is the /docs-refresh skill integrated as Forge's closing stage.
CLAUDE.md gotchas — did this work fix a gotcha? Remove it. Did it introduce a new danger? Add it. Still under 40-line target?
Memory files — do any arch-* or lesson-* files reference systems that changed? Update or flag.
MEMORY.md index — still under 180 lines? Any new entries needed? Any entries now stale?
docs/ vault — do architecture docs, specs, or runbooks need updating for what changed?
Common Failures — new pattern discovered during execution? Add it.
Knowledge Sweep (forge.db + global.db)
Auto-action (safe):
current_state pointers (forge abandoned/completed) → DELETEforges.plan_path where file doesn't exist → SET NULL + flagWarn-only (user decides):
~/.forge/global.db spikes >90 days without verification → "re-verify?"Quick fix (<3 files) → CLAUDE.md gotcha check only (30 seconds)
Standard (3-10 files) → CLAUDE.md + touched memory files + docs/ for affected systems
Complex (10+) → full /docs-refresh --scan-only, present triage to user
Machine 3 doesn't just audit — it GENERATES docs from forge results:
permanent=1 has no corresponding lesson-*.md, auto-create one in memory with the assumption, actual result, and "How to apply" section.SELECT findings FROM gates WHERE result='FAIL' GROUP BY findings HAVING COUNT(*) >= 3. If a finding repeats across forges, add it to CLAUDE.md gotchas section.docs/architecture/{system}.md or create if missing.--complete, if a lesson was recorded, check if it's already in memory. If not, auto-create lesson-{slug}.md.kg_add, architecture changes as drawers, refuted kg_fact as kg_invalidate) live in the palace_drafts table via addPalaceDraft. At this stage, call listPalaceDrafts(cwd, forgeId, 'pending') and present a single batch triage to the user: group by draft_type, show count + preview of each payload, ask approve all | reject all | edit. On approve, iterate drafts: call the matching MCP tool (mempalace_kg_add / mempalace_add_drawer / mempalace_kg_invalidate) per payload; on success call markPalaceDraftCommitted(cwd, draftId). On reject, call discardPalaceDraft(cwd, draftId) for each. Partial MCP failure leaves remaining rows pending — user can re-run Machine 3 to retry. All MCP calls subject to the A runtime degradation contract.Resume-surface contract: /forge --resume <slug> MUST run listPalaceDrafts(cwd, forgeId, 'pending') and surface the count to the user BEFORE continuing work on the resumed forge. If count > 0, prompt: "N pending palace drafts from last run — triage now? (y/N)". On y → run the batch-triage flow above; on N → note [palace-drafts-deferred] N rows to notepad and continue resume. Without this, drafts accumulate silently across park/resume cycles.
/docs-refresh --scan-only and include the triage in the completion report./docs-refresh standalone run.After ANY direct executor completes (non-forge work):
- Static analysis on changed files (whatever project has)
- Reference check on changed exports (callers updated?)
- Run tests on changed modules (if fast, <30s)
Verdict: CLEAN | SUSPECT [list]
These rules prevent the self-optimizing system from removing its own safety checks.
fact: → strategic: to dodge citation = mirage. Skeptic catches by checking claim shape.Note on MemPalace integration (2026-04-14): All palace recall, kg_fact: validation, palace_drafts write-through, completion diary, and spike mirror are fully additive. Rule #7 is preserved because kg_fact: has structural parity with fact: — kg_citation: <triple_id> is a required citation just like file:line, and Skeptic validates triple existence + current=true via mempalace_kg_query. Rule #5 is preserved because Skeptic reads KG as an additional data source, not as another gate's findings. All palace interactions honor the runtime degradation contract (A) and are disabled wholesale by --no-mempalace.
Forge manages persistent work units ("forges") that survive across sessions.
/forge "task" — create new forge, start pipeline
/forge --park [reason] — park current forge, save context to forge.db
/forge --resume [slug] — resume parked forge, restore context
/forge --spawn "sub-task" — create child forge (optionally --blocks-parent)
/forge --switch [slug] — park current + resume another (atomic)
/forge --list — show all forges with status
/forge --status — deep status of current forge + dependency tree
/forge --complete — mark done, record lesson, unblock dependents
/forge --abandon [reason] — mark abandoned, preserve context
/forge --full "task" — extended RESEARCH + mandatory spike
/forge --discuss "task" — extended CLARIFY for vague input
/forge --plan-only — stop after FINAL-PLAN
/forge --execute — load existing FINAL-PLAN, skip forge
/forge --no-docs — skip docs-refresh final stage
/forge --no-visionary "task" — skip visionary stream (equivalent to 'skip' in CLARIFY)
/forge --no-mempalace — skip all MemPalace integration (palace recall in PRECEDENT, kg_fact validation, palace_drafts write-through, completion diary, spike mirror). Use when palace MCP is unavailable or for hermetic runs.
All forge state lives in .omc/forge.db (SQLite). This is the sole source of truth.
ralph-state.json is a HUD display cache only — OMC may delete it on SessionEnd.
On every phase transition:
current_state table tracks the active forge for compaction recoveryIf /compact happens mid-forge, the PreCompact hook saves state to forge.db and injects a recovery prompt. After compaction, read forge.db current_state and resume:
SELECT value FROM current_state WHERE key = 'active_forge';
After each gate completes, record to forge.db:
INSERT INTO gates (forge_id, iteration, gate, result, findings)
VALUES (<id>, <iter>, 'skeptic', 'PASS', '["no mirages found"]');
After each spike:
INSERT INTO spikes (forge_id, assumption, result, actual, permanent)
VALUES (<id>, 'Dolphin supports concurrent tabs', 'refuted', '1 tab per profile', 1);
Spike mirror to palace (E). After recording a spike with permanent=1, mirror it into palace_drafts so Machine 3 can surface it for approval alongside other palace writes:
result='confirmed' → addPalaceDraft(cwd, forgeId, { draftType: 'kg_add', payload: { subject, predicate, object: actual }, sourceSpikeId: <spike.id> })result='refuted' → addPalaceDraft(cwd, forgeId, { draftType: 'add_drawer', payload: { wing: '<project>', room: 'documentation', content: 'Refuted: <assumption>. Actual: <actual>.' }, sourceSpikeId: <spike.id> })
Drafts are reviewed and committed in Machine 3 (item 5 in Auto-create). Subject to the A runtime degradation contract (skip enqueue if palace_drafts write fails — spike still recorded in forge.db).During RESEARCH or EXECUTE, if a sub-task is discovered:
/forge --spawn "fix batch recovery" --blocks-parent
--blocks-parent: current forge → status 'blocked', blocked_by = childRun /forge-setup once per project to get the most from Forge:
Forge works WITHOUT setup — it just greps whatever exists. Setup makes it better.
After all streams complete, the orchestrator evaluates integration contracts. Two outcomes:
dispatchIntegrationResult(cwd, forgeId, { allPass: true, lesson })All contracts have status='pass'. Calls completeForge(cwd, forgeId, lesson). Re-plan is NOT triggered.
dispatchIntegrationResult(cwd, forgeId, { allPass: false })One or more contracts have status='fail'. Calls reEnterPlanning(cwd, forgeId).
reEnterPlanning does the following:
integration_contracts WHERE status='fail' AND forge_id=?{ iteration, contracts: [...], timestamp } to forges.context.integration_failure_historyforges.iteration and sets phase='bf-plan-draft'iteration >= 5, calls blockForge(cwd, forgeId, 'integration-replan cap (5) exceeded') instead of incrementingIf configJson.integration_disabled is set (or the forge context has integrationDisabled: true), skip the integration gate entirely — neither completeForge nor reEnterPlanning is invoked via this path. The forge proceeds directly to VERIFY.
reEnterPlanning(cwd, forgeId)
dispatchIntegrationResult(cwd, forgeId, { allPass: boolean, lesson?: string })