From kernel
Validates development rules using scientific method: register hypotheses, design experiments, execute, score confidence, and graduate or kill rules based on evidence.
How this skill is triggered — by the user, by Claude, or both
Slash command
/kernel:experimentThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
<skill id="experiment">
Seed — parse CLAUDE.md for imperative/assertive statements → register as hypotheses
agentdb learn hypothesis "H{N}: {statement}" "{source}:{line}"Design — for each hypothesis under test, define experiment BEFORE running anything
Execute — run the experiment; record raw observations
Score — apply Bayesian update to confidence
confidence += (1 - confidence) * 0.25confidence -= confidence * 0.3agentdb learn experiment "H{N} result={supports|refutes|inconclusive}" "{evidence}"Transition — update hypothesis status per lifecycle rules
Report — surface verdict
<anti_patterns> Confirm a hypothesis without running a real experiment. Use a single data point to graduate a hypothesis. Ignore refuting evidence because the rule "feels right". Test a hypothesis with a method that can only confirm (design for falsifiability). Modify the hypothesis after seeing results (that is a new hypothesis). </anti_patterns>
<on_complete> agentdb write-end '{"skill":"experiment","hypotheses_tested":N,"graduated":[],"killed":[],"inconclusive":[]}' </on_complete>
npx claudepluginhub ariaxhan/kernel-claude --plugin kernelRuns safe-to-fail probes for complex domain hypotheses: qualifies constraints/criteria in foreground, then background experiments to sense retrospective cause-effect. Triggers on 'probe', 'test hypothesis'.
Reviews program.md experimental methodology for hypothesis clarity, measurement validity, control adequacy, scope, and strategy fit; emits APPROVED/NEEDS-REVISION/BLOCKED verdict before expensive runs.
Enforces scientific method—observation, falsifiable hypotheses, predictions, experiments, conclusions—for debugging unclear causes, intermittent issues, failed attempts, or uncertain architecture decisions.