deep-research

A Claude Code plugin for rigorous web research using three independent LLM families as cross-checkers, worktree-sandboxed for security, and primary-source verified via mechanical URL + passage checks.

The single goal: produce research output you can actually trust to make decisions on — by hardening against the failure modes single-pipeline LLM research is known to produce (consensus hallucination, fabricated citations, prompt injection from web content, CLI/API confabulation).

/research <your question>

What it does

/research spawns one researcher agent in worktree isolation. The agent then runs three parallel research streams:

Claude (you) — WebSearch + WebFetch
Gemini 3.1 Pro — via the agy (Antigravity) CLI, sandboxed (--sandbox)
GPT-5 — via OpenAI's codex CLI, sandboxed (-s read-only)

Each stream uses a different web-search backend (Google Search inside Gemini, OpenAI's web tool inside Codex, Claude's WebSearch). After all three return, the researcher triages four lists — Agreements / Gemini-only / GPT-only / Conflicts — and resolves each one via primary-source verification before including it in the final report.

The defense model is evidence weight > vote count. One cross-checker with a primary-source URL that passes mechanical verification (URL resolves 2xx + quoted passage actually appears in the fetched page) beats a two-way LLM consensus with no source. Why this matters is documented in the literature on consensus hallucination — when training corpora overlap heavily, LLM agreement-without-evidence is a weaker signal than minority-dissent-with- evidence (Shared Imagination, arXiv:2407.16604; Till et al. 2025).

Defenses

Threat	Defense
Consensus hallucination (overlapping LLM training corpora produce confident agreement on false facts)	3 different LLM families with different search backends + primary-source verification on every retained claim
Fabricated citations (LLM cites a paper / URL that doesn't exist or doesn't say what was claimed)	Rule C: `curl -sIL` HTTP code check on every URL retained + passage match (re-fetch the URL, confirm quoted value appears verbatim)
CLI / API / config-syntax confabulation (LLM "remembers" a flag that doesn't exist)	Rule B: empirical `<cmd> --help` or vendor-doc fetch before claiming a flag exists
Prompt injection from web content	Output wrapped in `<external-research trust="untrusted">` tags before main session processes it; researcher is reminded that fetched HTML is data, not instructions; web-suggested shell commands never executed
Code execution from compromised pages	Worktree isolation on the researcher (sandboxed copy of repo); `agy --sandbox` enables kernel-level terminal restrictions; `codex -s read-only` enforced via macOS Seatbelt / Linux bubblewrap
`$TMPDIR` race condition between concurrent `/research` sessions	Each researcher generates a unique `RUN_ID` via `date +%s%N` and namespaces all tempfile paths with it

Installation

This is a local Claude Code plugin. Drop it under your plugins directory and enable it.

# Clone into local plugins
mkdir -p ~/.claude/plugins/local/plugins
cd ~/.claude/plugins/local/plugins
git clone https://github.com/oh-rid/deep-research.git

Restart Claude Code (or reload plugins) and /research becomes available.

Requirements

The plugin works in degraded mode without optional cross-checkers, but for full 3-way triangulation you need:

agy (Antigravity CLI) — runs Gemini models via Google AI Pro. Install via Homebrew or the official installer. Model is set in ~/.gemini/antigravity-cli/settings.json ("model": "Gemini 3.1 Pro (High)" is the recommended choice; Flash variants may return empty on long academic prompts).
codex (OpenAI Codex CLI) — runs GPT-5. Web search must be enabled in ~/.codex/config.toml:
```
[tools.web_search]
enabled = true
```

If either CLI is missing the researcher empirically detects it (which agy / which codex) and proceeds with the remaining cross-checker plus its own WebSearch.

Usage

/research what's the current academic consensus on r-star (natural rate of interest)?

/research is the Brand-Goy-Lemke SUERF Policy Brief 1360 about the US or the euro area?

/research what did the latest FOMC minutes say about rate hikes, and how did the curve react?

The slash command spawns the researcher, wraps the result in an untrusted-source block, and presents a structured final answer to you with:

deep-research

Popularity

What's Inside

README

deep-research

What it does

Defenses

Installation

Requirements

Usage

Confidence

Similar Plugins

insane-research

greedysearch

co-researcher

researcher

deep-research

octo

Popularity

Health & Quality

Similar Plugins

insane-research

greedysearch

co-researcher

researcher

deep-research

octo