By 88plug
Falsification-first investigation workflow: convert every assertion into a labeled falsifiable hypothesis, predict before measuring, run controlled experiments, verify findings adversarially (REFUTE-first), and persist verdicts in a hypothesis ledger so killed ideas are never re-attacked
Convene a model council: the same question answered independently across different models, dissent surfaced, cruxes routed to probes
Attack an asserted limit, ceiling, or claim with designed experiments
Run a full invention campaign: ideate past a limit, refute, build, measure against acceptance criteria, provenance-search, certify
Run a full scientific investigation: hypotheses, controlled experiments, verdicts, ledger
Create or update the persistent hypothesis ledger (EXPERIMENTS.md) for this project
Use this agent as one independent seat in a model council — the same question is posed to several seats, each running on a different model (pass a different model override per spawn), blind to each other's answers. The seat investigates with its own read-only probes and returns a structured position with evidence, calibrated confidence, and an explicit "what would change my mind". Spawn 3-5 seats in parallel for judgment calls: design decisions, interpretation of ambiguous results, risk assessments, go/no-go calls. Do not use a council to settle purely empirical questions — those go to experiments.
Use this agent to turn one asserted limit, claim, or candidate root cause into a rigorously designed experiment — without running it. It returns a falsifiable hypothesis with an explicit null, a written probe artifact, a pre-committed outcome→conclusion table, and what each outcome unlocks. Spawn one per hypothesis when fanning out a falsification campaign (the parent runs the probes serially for clean numbers). Use whenever an investigation has accumulated untested assertions ("the ceiling is X", "the daemon causes Y", "Z can't work") that need designed experiments rather than debate.
Use this agent as the area chair who closes a peer-review round. It receives the submission packet, all independent reviews, the author rebuttal, and any re-scores, then issues the final decision (accept / minor-revision / major-revision / reject) with camera-ready requirements. It weighs evidence quality rather than counting votes — one reviewer with a failed reproduction outweighs three approving skims. Spawn exactly one, after rebuttal, never before all reviews are in.
Use this agent as one independent reviewer in a peer-review round for an invention, design, finding, or paper-style writeup. Each reviewer gets the same submission packet plus ONE assigned lens (soundness, prior-art/provenance, reproducibility, significance, or fatal-flaw) and works blind to the other reviewers. Reviews are execution-grounded — the reviewer runs the Reproduce block, searches prior art, or re-derives the numbers depending on lens — and return structured scores plus an accept/revise/reject recommendation. Spawn 3-5 with different lenses after a finding survives the refute gate and before it is built, merged, published, or sent externally.
Use this agent to evaluate a finding, claimed root cause, performance claim, invention, or research result by attempting to refute it before judging what survives. It returns a verdict (confirmed / prototype / research / kill) with the refutation analysis, calibrated confidence, and a kill_reason when applicable. Spawn one fresh refuter per claim — the author of a finding must not referee it. Use proactively before acting on any finding, sending conclusions externally, merging a "fix", or relaying research-agent results the parent has not independently verified.
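The pre-committed outcome→conclusion table that the experiment-design agent returns can be pictured as a small lookup that is fixed before the probe runs. The following is a minimal sketch, not the plugin's own format: the `Outcome` class, the `conclude` helper, and the 1000 ops/s ceiling are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: rows cannot be edited after the probe runs
class Outcome:
    low: float        # inclusive lower bound of the measured value
    high: float       # exclusive upper bound
    conclusion: str   # what this outcome means, decided in advance


def conclude(table: list[Outcome], measured: float) -> str:
    """Return the conclusion that was committed to before measuring."""
    for row in table:
        if row.low <= measured < row.high:
            return row.conclusion
    # A value outside every row means the experiment design was incomplete.
    raise ValueError(f"no pre-committed outcome covers {measured}")


# Hypothetical claim under attack: "the ceiling is 1000 ops/s".
TABLE = [
    Outcome(0.0, 1000.0, "ceiling not exceeded: claim survives this probe"),
    Outcome(1000.0, float("inf"), "ceiling exceeded: claim falsified"),
]

if __name__ == "__main__":
    print(conclude(TABLE, 1250.0))  # ceiling exceeded: claim falsified
```

Freezing the rows is one way to make the pre-commitment hard to quietly revise once a number is in hand; a real campaign might instead rely on the written probe artifact for that.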
Uses power tools
Uses Bash, Write, or Edit tools
No model invocation
Executes directly as bash, bypassing the AI model
A Claude Code plugin that runs investigations, debugging, performance work, and claim validation as falsification-first campaigns — for engineers who need to be right, not just confident.
/plugin marketplace add 88plug/scientific-method
/plugin install scientific-method@scientific-method
Run a full campaign on any problem in one command:
/scientific-method:investigate the API returns 500 under load but not in tests
You get back labeled hypotheses (H1..Hn), a prediction for each written before any measurement, the cheapest probe run first, and a calibrated verdict — with every result logged to a persistent ledger so killed ideas stay killed.
No setup, no API keys, no MCP server. The plugin enforces method over your existing tools.
Most debugging is guessing dressed up as analysis. This plugin makes Claude work like a scientist: turn each assertion into a falsifiable hypothesis, predict the outcome before measuring, run a controlled experiment, attack the result before trusting it, and record the verdict so it survives across sessions.
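The loop described above (hypothesis, prediction, measurement, verdict) can be sketched in a few lines. This is an illustrative model under stated assumptions, not the plugin's implementation: the `Hypothesis` class, the `run_campaign` helper, the stubbed probe values, and the two-way "survived"/"killed" verdict are hypothetical simplifications of the plugin's richer verdict vocabulary.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hypothesis:
    label: str                    # e.g. "H1"
    claim: str                    # falsifiable statement
    cost: float                   # rough probe cost, arbitrary units
    probe: Callable[[], float]    # runs the controlled experiment
    threshold: float              # boundary agreed on before measuring
    prediction: Optional[str] = None
    verdict: Optional[str] = None

    def predict(self, text: str) -> None:
        """Write the prediction down before any measurement happens."""
        self.prediction = text

    def run(self) -> str:
        # Refuse to measure until a prediction has been recorded.
        if self.prediction is None:
            raise RuntimeError(f"{self.label}: predict before measuring")
        observed = self.probe()
        # The claim survives only if the observation stays under the threshold.
        self.verdict = "survived" if observed < self.threshold else "killed"
        return self.verdict


def run_campaign(hypotheses: list[Hypothesis]) -> dict[str, str]:
    """Run the cheapest probe first and collect one verdict per label."""
    return {h.label: h.run() for h in sorted(hypotheses, key=lambda h: h.cost)}


if __name__ == "__main__":
    # Stubbed probes with made-up numbers stand in for real measurements.
    h1 = Hypothesis("H1", "p99 latency stays under 200 ms", 5.0, lambda: 340.0, 200.0)
    h2 = Hypothesis("H2", "error rate stays under 50 per hour", 1.0, lambda: 12.0, 50.0)
    h1.predict("p99 will land near 150 ms")
    h2.predict("error rate will be about 10 per hour")
    print(run_campaign([h1, h2]))  # {'H2': 'survived', 'H1': 'killed'}
```

The guard in `run` is the point of the sketch: a measurement taken without a recorded prediction cannot falsify anything, so the code refuses to take it.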
It is distilled from real session transcripts where the method cracked problems ordinary debugging did not — a GPU codec campaign that falsified four asserted "physical" performance walls, a fleet forensics investigation that killed two plausible-but-wrong root causes with control cases before filing a vendor bug, and benchmark work where honest baselines caught regressions that averages hid.
> [!NOTE]
> This is a methodology plugin. It ships a skill, commands, agents, and one read-only hook, with no MCP server, output style, or statusline. The hook is the only thing that runs automatically, and it only reads `EXPERIMENTS.md`.
Verdicts persist in a hypothesis ledger (`EXPERIMENTS.md`) with a falsification log marked DO-NOT-RE-ATTACK, so killed ideas stay killed across sessions and compactions.

Each stage maps to a command you can run on its own, or that `investigate` chains for you.
| Command | What it does |
|---|---|
| `/scientific-method:investigate <problem>` | Full campaign: hypotheses, controlled experiments, verdicts, ledger |
| `/scientific-method:falsify <claim>` | Attack an asserted limit, ceiling, or claim with designed probes |
| `/scientific-method:invent <problem>` | Invention campaign: ideate past a limit, refute, build, measure vs tuned baseline, provenance-search, certify |
| `/scientific-method:verdict [claims]` | Adversarial REFUTE-first review of findings before they are trusted |
| `/scientific-method:ledger [sync]` | Create or update the persistent hypothesis ledger |
| `/scientific-method:council <question>` | Model council: same question to independent seats on different models, dissent surfaced, factual cruxes routed to probes |
| `/scientific-method:peer-review <work>` | Blind lensed reviewers who execute, rebuttal answered with evidence, area-chair decision |
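The rule that killed ideas stay killed can be pictured as a lookup performed before any new probe is designed. The sketch below is only an assumption about how such a check might work: the real layout of `EXPERIMENTS.md` is not shown in this README, so the in-memory `Ledger` class and its method names are hypothetical. The `kill` verdict and the kill reason mirror the vocabulary the refuter agent is described as returning.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    # Maps a normalized claim to (verdict, reason). Hypothetical layout,
    # not the actual EXPERIMENTS.md format.
    entries: dict[str, tuple[str, str]] = field(default_factory=dict)

    @staticmethod
    def _key(claim: str) -> str:
        # Collapse case and whitespace so trivial rewordings still match.
        return " ".join(claim.lower().split())

    def record(self, claim: str, verdict: str, reason: str = "") -> None:
        """Store the latest verdict for a claim."""
        self.entries[self._key(claim)] = (verdict, reason)

    def do_not_re_attack(self, claim: str) -> bool:
        """True when the claim was already killed earlier."""
        verdict, _ = self.entries.get(self._key(claim), ("", ""))
        return verdict == "kill"


if __name__ == "__main__":
    ledger = Ledger()
    ledger.record(
        "The daemon causes Y",
        "kill",
        "control case without the daemon still shows Y",
    )
    print(ledger.do_not_re_attack("the daemon  causes y"))  # True
    print(ledger.do_not_re_attack("Z can't work"))          # False
```

String normalization only catches trivial rewordings; matching a rephrased claim to a killed hypothesis would need judgment that a lookup like this does not provide.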
The agents run under the hood when a campaign fans out. You can also invoke them directly.
npx claudepluginhub 88plug/claude-code-plugins --plugin scientific-method