Audit AI assistant outputs and pasted text for hallucinations including speculation, ungrounded causality, pseudo-quantification, and completeness overclaims. Receive Pass/Fail results with flagged phrases and rewrite suggestions. Automatically block completions on triggers via stop hooks, inject prevention prompts at session start, and track stats after tool use to enforce evidence-first responses.
npx claudepluginhub jamie-bitflight/claude_skills --plugin hallucination-detector
The plugin's hooks match all tools: they run on every tool call, not just specific ones.
Stop-hook hallucination and speculation-as-diagnosis detector. Audits the last assistant message for speculation, ungrounded causality, pseudo-quantification, and completeness overclaims; blocks stopping to force evidence-first rewrites.
Prevents Claude from finishing tasks with speculation, unverified claims, or invented causality.
Claude regularly delivers responses that sound confident but contain:
- speculation presented as diagnosis
- invented causality ("X because Y" asserted without evidence)
- pseudo-quantification (scores and percentages with no methodology)
- completeness overclaims
These patterns are dangerous precisely because they sound authoritative — indistinguishable from real analysis. This plugin forces Claude to ground every claim in actual observations before it can complete a task.
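The blocking mechanism can be sketched as a Stop-hook script. In Claude Code, a hook command receives its event as JSON on stdin and can exit with code 2 to block the action, with stderr fed back to Claude. The payload field name and phrase list below are assumptions for illustration, not the plugin's actual rules:

```python
import json
import re
import sys

# Phrases that signal speculation rather than verified observation
# (illustrative list, not the plugin's real rule set).
SPECULATION = re.compile(
    r"\b(probably|likely|presumably|might|could be|should be)\b",
    re.IGNORECASE,
)

def audit(message: str) -> list[str]:
    """Return the speculative phrases found in an assistant message."""
    return sorted({m.lower() for m in SPECULATION.findall(message)})

def run_hook() -> int:
    """Stop-hook entry point.

    The 'last_assistant_message' field name is an assumption; check the
    Claude Code hooks reference for the actual payload shape.
    """
    event = json.load(sys.stdin)
    flagged = audit(event.get("last_assistant_message", ""))
    if flagged:
        # Exit code 2 blocks the Stop event; stderr is fed back to
        # Claude so it can rewrite the response with evidence.
        print(f"Speculative phrasing detected: {flagged}. "
              "Ground each claim in observed output before finishing.",
              file=sys.stderr)
        return 2
    return 0
```

A caught message forces a rewrite loop: Claude sees the stderr text, replaces the speculation with checked evidence, and only then is allowed to stop.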
Use the Vercel Skills CLI to install across any supported agent:
npx skills add bitflight-devops/hallucination-detector
Target a specific agent:
npx skills add bitflight-devops/hallucination-detector -a claude-code
npx skills add bitflight-devops/hallucination-detector -a cursor
npx skills add bitflight-devops/hallucination-detector -a codex
npx skills add bitflight-devops/hallucination-detector -a opencode
To uninstall:
npx skills remove hallucination-detector
In Claude Code, register the marketplace first:
/plugin marketplace add bitflight-devops/hallucination-detector
Then install the plugin:
/plugin install hallucination-detector@hallucination-detector
In Cursor Agent chat, install from marketplace:
/plugin-add hallucination-detector
Tell Codex:
Fetch and follow instructions from https://raw.githubusercontent.com/bitflight-devops/hallucination-detector/refs/heads/main/.codex/INSTALL.md
Detailed docs: .codex/INSTALL.md
Tell OpenCode:
Fetch and follow instructions from https://raw.githubusercontent.com/bitflight-devops/hallucination-detector/refs/heads/main/.opencode/INSTALL.md
Detailed docs: .opencode/INSTALL.md
To verify the installation, start a new session on your chosen platform and elicit a speculative response (e.g., ask "why did X happen?" so Claude answers with "this is probably caused by..."). The plugin should block the response and require an evidence-first rewrite.
For detailed information about how the plugin works, its architecture, configuration, and internal detection mechanisms, see the Architecture Reference.
LLMs like Claude are optimized during training to produce responses that appear helpful and confident. This creates a systematic failure mode:
Speculation as diagnosis - When asked "why did X happen?", Claude draws on training patterns to generate plausible-sounding explanations. These explanations feel authoritative but have no connection to the actual state of your system. Claude hasn't checked logs, read config files, or verified anything — it's pattern-matching from training data.
Invented causality - Causal claims ("X because Y") require evidence showing the relationship. Claude often asserts causality based on what typically causes similar symptoms, not what actually caused this specific instance. The word "because" in Claude's output frequently signals unverified inference.
Fake rigor - Scores and percentages ("8/10 quality", "70% improvement") create an illusion of measurement. Without methodology, sample size, and reproducible criteria, these numbers are meaningless — yet they make responses feel more credible.
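For illustration, each of these failure modes can be approximated with crude surface-level patterns. The plugin's actual detection rules are not reproduced here, so the patterns below are an assumed sketch only:

```python
import re

# Illustrative surface patterns for each failure mode; real heuristics
# are more involved, so treat these as a starting point only.
CHECKS = {
    "speculation-as-diagnosis": re.compile(
        r"\b(probably|likely|presumably|must be)\b", re.IGNORECASE),
    # "because" with no reference to observed evidence anywhere after it
    "invented-causality": re.compile(
        r"\bbecause\b(?!.*\b(log|trace|output|observed|shows)\b)",
        re.IGNORECASE),
    # a bare number dressed up as a score or percentage
    "fake-rigor": re.compile(r"\b\d+(\.\d+)?\s*(%|/10)\b"),
}

def flag(message: str) -> dict[str, list[str]]:
    """Map each failure mode to the exact phrases that triggered it."""
    return {
        name: [m.group(0) for m in pattern.finditer(message)]
        for name, pattern in CHECKS.items()
        if pattern.search(message)
    }
```

A regex pass like this over-flags legitimate hedging, which is part of why the audit has to consider the whole message rather than isolated phrases.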