By atomicstrata
Enforce a falsification-driven research pipeline for empirical ML and evaluation work: transform vague ideas into preregistered hypotheses, reproduce baselines, run adversarial attacks, apply statistical rigor, and gate kill-or-ship decisions on repository evidence.
npx claudepluginhub atomicstrata/epistemic --plugin epistemic-skills
Use when a claim, result, or draft compares your system against an external baseline, published paper number, or competitor result and you need a sourced local reproduction before quoting it.
Use when starting or continuing empirical/ML research, evals, benchmarks, or any "is X better than Y?" claim — the umbrella mechanism that enforces hypothesis → preregistration → baseline → experiment → statistical rigor → falsification → kill-or-ship → verification, with gates you self-enforce.
Use when a preregistered hypothesis is ready to run and you must generate provisional evidence under the locked method without contaminating headline outputs.
Use when a numerical claim, comparison claim, or headline result is about to leave `smokes/` or be written into `RESULTS.md`.
Hugging Face Hub CLI (`hf`) for downloading, uploading, and managing models, datasets, spaces, buckets, repos, papers, jobs, and more on the Hugging Face Hub. Use when: handling authentication; managing local cache; managing Hugging Face Buckets; running or scheduling jobs on Hugging Face infrastructure; managing Hugging Face repos; discussions and pull requests; browsing models, datasets and spaces; reading, searching, or browsing academic papers; managing collections; querying datasets; configuring spaces; setting up webhooks; or deploying and managing HF Inference Endpoints. Make sure to use this skill whenever the user mentions 'hf', 'huggingface', 'Hugging Face', 'huggingface-cli', or 'hugging face cli', or wants to do anything related to the Hugging Face ecosystem and to AI and ML in general. Also use for cloud storage needs like training checkpoints, data pipelines, or agent traces. Use even if the user doesn't explicitly ask for a CLI command. Replaces the deprecated `huggingface-cli`.
Executes bash commands
Hook triggers when Bash tool is used
Ξ epistemic
The open source research-discipline coding agent.
epistemic gives your coding agent the norms of good ML research. Instead of running experiments, eyeballing a number, and moving on, it enforces a real method: pre-register a hypothesis, reproduce the baseline, run the experiment, attack your own claim, then decide to ship or kill — with an interactive monitor and gates that make the rules automatic.
The skills are the portable manual the agent follows step by step. The harnesses inject that manual into Claude Code, Codex, or the epistemic TUI. The gates are the safety net that enforces it where the harness supports runtime hooks.
Give your agent epistemic: Claude Code, Codex CLI, Codex App, epistemic TUI.
It starts from the moment your agent picks up an empirical task. Instead of jumping straight to running code, it steps back and asks what you're really trying to prove.
Once it has a rough claim, it asks one question at a time — Socratic-style — until the hypothesis is falsifiable, the falsifier is concrete, and the budget is realistic. Before locking in, it generates 2–3 competing explanations with unique disconfirming predictions so you pick the strongest one.
After you sign off, the agent locks the hypothesis in a pre-registration file before touching any experiment code. The prereg gate then blocks any experiment-shaped command that has no matching prereg.md, so in harnesses that support runtime hooks an unregistered experiment cannot run by accident.
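The gate logic described here can be pictured with a minimal sketch. This is not the project's actual hook: the command patterns, the `hypotheses/<ID>/prereg.md` layout, and the function names are all assumptions made for illustration. Only the idea (block an experiment-shaped command with no matching prereg.md) and the `H-003` style of hypothesis ID come from this README.

```python
import re
from pathlib import Path

# Hypothetical patterns for "experiment-shaped" commands.
# The real gate's matching rules are not documented in this README.
EXPERIMENT_PATTERNS = [
    re.compile(r"\bpython3?\s+\S*(train|eval|experiment)\S*\.py\b"),
]
# Hypothesis IDs in the style of "H-003", as used in the examples.
HYPOTHESIS_ID = re.compile(r"\bH-\d{3}\b")


def is_experiment_shaped(command: str) -> bool:
    """Return True if the command looks like it launches an experiment."""
    return any(p.search(command) for p in EXPERIMENT_PATTERNS)


def prereg_gate(command: str, repo_root: Path) -> tuple[bool, str]:
    """Return (allowed, reason) for a shell command about to run."""
    if not is_experiment_shaped(command):
        return True, "not an experiment command"
    match = HYPOTHESIS_ID.search(command)
    if match is None:
        return False, "experiment command names no hypothesis ID"
    # Assumed layout: one directory per hypothesis holding its prereg.md.
    prereg = repo_root / "hypotheses" / match.group(0) / "prereg.md"
    if not prereg.is_file():
        return False, f"no prereg.md for {match.group(0)}"
    return True, f"prereg found for {match.group(0)}"
```

In a harness with runtime hooks, a check like this would run before each shell command and refuse to proceed when `allowed` is false. Harnesses without hooks would have to rely on the skill text alone.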
From there it reproduces the competitor's baseline under your locked judge, runs the full experiment, applies proper statistics, and sends the claim to adversary models that each try to disprove it. If any adversary succeeds, the result is blocked. If all pass, it lands in RESULTS.md and you decide: kill, pivot, refine, recommit, or ship.
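The "any adversary blocks" rule can be sketched as a small aggregation step. The `AdversaryVerdict` type and `falsification_gate` function are hypothetical names, not part of epistemic. The minimum of two adversaries follows the falsification-review description in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdversaryVerdict:
    """One adversary model's attempt to disprove a claim (hypothetical type)."""
    adversary: str
    falsified: bool
    rationale: str


def falsification_gate(
    verdicts: list[AdversaryVerdict], min_adversaries: int = 2
) -> tuple[bool, list[str]]:
    """Return (passed, reasons). The claim passes only if enough
    adversaries reviewed it and none of them falsified it."""
    if len(verdicts) < min_adversaries:
        return False, [f"only {len(verdicts)} adversaries, need {min_adversaries}"]
    # A single successful falsification is enough to block the result.
    failures = [v for v in verdicts if v.falsified]
    if failures:
        return False, [f"{v.adversary}: {v.rationale}" for v in failures]
    return True, []


verdicts = [
    AdversaryVerdict("adversary-a", False, "could not break the claim"),
    AdversaryVerdict("adversary-b", True, "effect vanishes on a held-out seed"),
]
passed, reasons = falsification_gate(verdicts)
```

Here `passed` is false because one adversary succeeded, so the claim would stay out of RESULTS.md.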
Because the skills trigger automatically, you don't need to orchestrate anything. Your coding agent just has epistemic.
$ epistemic "does LoRA at rank 8 outperform rank 4 on math benchmarks"
→ Opens research-question skill, refines to a falsifiable hypothesis
$ epistemic "run the registered experiment H-003"
→ Checks prereg gate, routes to the correct compute target, logs costs
$ epistemic monitor
→ Full-screen experiment tree: running, shipped, killed, pending
$ epistemic fleet
→ Launches a parallel agent fleet across all pending hypotheses
$ /skill:falsification-review
→ Sends the current claim to ≥2 adversary models; blocks if any falsify it
$ /skill:kill-or-ship
→ Decision gate: KILL / PIVOT / REFINE / RECOMMIT / SHIP
Ask naturally or use slash commands as shortcuts.
| Command | What it does |
|---|---|
| /skill:research-question | Refine a rough idea into a falsifiable, pre-registerable hypothesis |
| /skill:preregistration | Lock hypothesis, judge config, and compute scaffold before running |
| /skill:baseline-reproduction | Reproduce the competitor's result under your locked judge |
| /skill:experiment-execution | Run with discipline: locked env, full sample, cost logging |
| /skill:statistical-rigor | Effect sizes, test selection, multiple-comparison correction, APA reporting |
| /skill:falsification-review | Adversary models try to disprove the claim; blocks if any succeed |
| /skill:surprise-triage | Diagnose results that diverge >15% before they reach RESULTS.md |
| /skill:kill-or-ship | Final decision gate with five outcomes |
| /skill:verification-before-publication | Full pre-publish checklist |
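As an illustration of what the statistical-rigor step covers, here is a minimal sketch of one effect size and one multiple-comparison correction. The README does not say which methods the skill uses, so Cohen's d and the Holm-Bonferroni procedure are assumptions chosen as common defaults.

```python
import math
from statistics import mean, variance


def cohens_d(a: list[float], b: list[float]) -> float:
    """Standardized mean difference between two samples (pooled SD)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def holm_bonferroni(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Return, per hypothesis, whether its null is rejected under Holm's
    step-down procedure, which controls the family-wise error rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Compare the k-th smallest p-value against alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


effect = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
decisions = holm_bonferroni([0.01, 0.04, 0.03])
```

With these toy inputs, `effect` is 0.5 and only the first comparison survives correction, even though all three p-values are below 0.05 on their own.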
Install with the install script:
curl -fsSL https://raw.githubusercontent.com/moralespanitz/epistemic/master/install.sh | sh
To pin a version:
curl -fsSL https://raw.githubusercontent.com/moralespanitz/epistemic/master/install.sh | sh -s -- --ref v1.0.0
Installs to ~/.epistemic with a symlink in ~/.local/bin/epistemic. Requires Node.js v18+.
Default model: openrouter/deepseek/deepseek-v4-pro. If you have OpenAI Codex authed, it uses that instead. If nothing is authed, pi prompts /login.
To install just the research methodology skills, with no TUI and no runtime dependency:
curl -fsSL https://raw.githubusercontent.com/moralespanitz/epistemic/master/install-skills.sh | sh
With the optional Hugging Face skills:
curl -fsSL https://raw.githubusercontent.com/moralespanitz/epistemic/master/install-skills.sh | sh -s -- --hf