Scores past predictions against actual sprint outcomes, creates calibration claims, and computes accuracy scorecards by evidence tier and claim type. Useful for feedback loops after implementations.
npx claudepluginhub grainulation/grainulator --plugin grainulator

This skill uses the workspace's default tool permissions.
Related skills:
Scores completed OKR sets at cycle close, with KR-level grading by type (committed, aspirational, learning, etc.), evidence quality assessment, learning synthesis, and next-cycle recommendations. Enforces rules against retroactive changes and gaming.
Analyzes sprint claims for type distributions, evidence quality tiers, claims stale for more than 7 days, velocity metrics, and prediction scoring. Generates HTML retrospective reports.
Generates dev cycle feedback reports: calculates assertiveness scores, analyzes prompt quality, aggregates metrics, runs root-cause analysis on failures, and writes output to docs/feedbacks/cycle-{date}/.
The user wants to check what actually happened after a sprint's recommendations were implemented.
$ARGUMENTS
Expected format: /calibrate --outcome "what happened" or /calibrate <claim_id> "actual result"
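For example (the claim id and outcome text below are illustrative, not real data):

/calibrate --outcome "Cache migration shipped; p95 latency dropped 35% rather than the predicted 40%"
/calibrate est012 "Migration took 4 days against the 2-day estimate"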
Parse the outcome: The user provides outcome data as free text or claim-specific results.
Match outcomes to predictions: Use wheat_search to find the original estimate, recommendation, or risk claims that predicted something. Compare prediction to actual outcome.
Create calibration claims as cal### claims with evidence tier production (these are real outcomes).
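For illustration, a matched prediction/outcome pair and the calibration claim it would produce might look like the record below. Only the cal### id convention and the production tier come from this command; the field names and values are assumptions, not the plugin's actual claim schema.

```python
# Hypothetical calibration record; the shape is illustrative only.
cal_claim = {
    "id": "cal001",                 # cal### numbering per this command
    "scores": "est012",             # hypothetical id of the original prediction
    "predicted": "migration takes 2 days",
    "actual": "migration took 4 days",
    "grade": "wrong",               # accurate | partial | wrong
    "prediction_tier": "stated",    # evidence tier the original claim carried
    "evidence_tier": "production",  # real outcomes always get the production tier
}
```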
Compute accuracy scorecard: what fraction of stated vs web vs documented vs tested claims were accurate?
Run wheat_compile.
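A minimal sketch of the scorecard arithmetic, assuming calibration claims shaped like the hypothetical record above (this helper is not part of the plugin):

```python
from collections import defaultdict

def scorecard(cal_claims: list[dict]) -> dict:
    """Aggregate grades overall and compute accuracy per original evidence tier."""
    totals = {"accurate": 0, "partial": 0, "wrong": 0}
    by_tier = defaultdict(lambda: {"total": 0, "accurate": 0})
    for c in cal_claims:
        totals[c["grade"]] += 1
        tier = by_tier[c["prediction_tier"]]  # tier of the original prediction
        tier["total"] += 1
        tier["accurate"] += c["grade"] == "accurate"
    return {
        "scored": len(cal_claims),
        **totals,
        "accuracy_by_tier": {
            t: f"{100 * v['accurate'] / v['total']:.0f}%"
            for t, v in by_tier.items()
        },
    }
```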
Print scorecard:
Calibration results:
Predictions scored: <N>
Accurate: <N> (<percent>)
Partially accurate: <N>
Wrong: <N>
Accuracy by evidence tier:
stated: <percent>
web: <percent>
documented: <percent>
tested: <percent>
Next steps:
/brief -- recompile with calibrated data
/research <topic> -- investigate where predictions went wrong