auto-research

A Claude Code plugin that automates the full LLM research lifecycle (literature → idea → experiment → paper). It covers Evaluation/Benchmark, Agent/Tool-use, Fine-tuning/Post-training, Prompt/ICL, and research on attention and LLM architecture internals. It assumes Python (uv + PyTorch + HuggingFace) and runs as an automated 8-phase workflow with 4 user gates.

Publisher marketplace: auto-research@auto-research · marketplace and plugin share one repository (0h-n0/auto-research)

npx claudepluginhub 0h-n0/auto-research --plugin auto-research

Popularity

Stars

Med: 0·Avg: 495

Copy clicks

Med: 0·Avg: 1

What's Inside

Slash Commands (8)

Execution steps

/lessons-search

Searches the cross-project Lessons DB (~/.research-lessons.json) by free text plus a tag filter, so that the institutional memory of past projects can be reused in Phase 3 / 6 of a new project. (v0.15.0+)
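The free-text plus tag filter described above can be sketched as follows. This is only an illustration: the record layout (`project`, `text`, `tags`) and the `search_lessons` helper are assumptions, since the listing does not document the actual schema of ~/.research-lessons.json.

```python
# Hypothetical lesson records; the real ~/.research-lessons.json schema is not documented here.
LESSONS = [
    {"project": "proj-a", "text": "Warmup was too short for LoRA rank 64", "tags": ["fine-tuning", "lora"]},
    {"project": "proj-b", "text": "Shuffled-control baseline caught a leaky eval split", "tags": ["evaluation"]},
    {"project": "proj-c", "text": "LoRA on attention only underfit the math set", "tags": ["fine-tuning", "attention"]},
]


def search_lessons(lessons, query="", tags=()):
    """Return lessons whose text contains `query` (case-insensitive) and that carry every tag in `tags`."""
    needle = query.lower()
    wanted = set(tags)
    return [
        lesson
        for lesson in lessons
        if needle in lesson["text"].lower() and wanted <= set(lesson["tags"])
    ]


hits = search_lessons(LESSONS, query="lora", tags=["fine-tuning"])
for hit in hits:
    print(hit["project"], "-", hit["text"])
```

Requiring every requested tag (set inclusion) keeps the filter strict; a looser "any tag" match would be an equally plausible design.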

Execution steps

/notebook-viz

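Keeps the Markdown files as the source of truth (SoT) while building a visualized HTML site with MkDocs Material under `.research/<slug>/viz/`. events.jsonl is rendered as a time-series chart with Chart.js, the Tags in LAB_NOTEBOOK become a reverse index via the tags plugin, and STATE.json becomes a phase progress bar. (v0.17.0+)

Prerequisites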

/research-design

Gap analysis, idea extraction, and experiment design (Phase 3-4). Runs research-gap-finder ×3 in parallel, then experiment-designer.

Prerequisites

/research-experiment

Experiment scaffolding (TDD) and the production run (Phase 5-6). Uses ml-engineer + research.experiment.run + result-statistician + (attention-analyst).

Prerequisites

/research-review

Self-review (Phase 8). Invokes research-gap-finder in reviewer mode and checks the latest differences with gemini. Gate G4 decides whether to publish.

Agents (6)

Agent Dispatch Matrix

/DISPATCH_MATRIX

Summarizes in a single table the role boundaries between the **5 specialist subagents** used by the `auto-research` plugin and the existing user-side agents (`arxiv-mcp-agent`, `ml-engineer`).

attention-analyst

/attention-analyst

A mechanistic interpretability specialist agent for LLM internals. It runs logit lens / attention pattern / activation patching / path patching / probing classifier / SAE feature lookup with TransformerLens (≤7B) or nnsight (>7B) and provides causal, mechanism-level interpretations. ml-engineer treats the LLM as a black box, whereas this agent treats it as a glass box.

Use when: running internal analysis in Phase 5/6 of a focus_area=attention project, or answering mechanistic questions such as "Why does model X fail on Y?"

<example> Context: After SFT, the model's TriviaQA performance dropped, and we want to find out what broke. user: "After SFT on math, TriviaQA dropped 8 points. Why?" assistant: "I'll dispatch attention-analyst to compare attention/MLP activations between base and SFT checkpoints on TriviaQA examples and localize the drift." </example>

<example> Context: We want to explain ICL behavior from the inside. user: "Does Llama-3 8B have induction heads at layer 5?" assistant: "Using attention-analyst to run the standard induction-head detection protocol and report per-head scores with shuffled-control." </example>

Do NOT use for: training/fine-tuning implementation (→ ml-engineer), paper search (→ arxiv-mcp-agent), production inference optimization (→ ml-engineer), statistical testing (→ result-statistician).
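The listing does not spell out the induction-head detection protocol it mentions. One common formulation scores a head on a sequence made of a random token block repeated twice, by measuring how much each second-half query attends to the token right after its first-half copy. The sketch below applies that score to synthetic attention patterns; `induction_score` and `synthetic_head` are hypothetical helpers, and a real analysis would read the patterns from TransformerLens or nnsight instead.

```python
def induction_score(attn, half_len):
    """Mean attention from each second-half query to the token right after its first-half copy.

    `attn[q][k]` is one head's attention weight from query position q to key position k,
    computed on a sequence made of a random block of `half_len` tokens repeated twice.
    """
    queries = range(half_len, 2 * half_len)
    return sum(attn[q][q - half_len + 1] for q in queries) / half_len


def synthetic_head(half_len, induction=True):
    """Build a toy causal attention pattern: a perfect induction head or a uniform head."""
    n = 2 * half_len
    rows = []
    for q in range(n):
        if induction and q >= half_len:
            row = [0.0] * n
            row[q - half_len + 1] = 1.0  # all weight on the token after the earlier copy
        else:
            row = [1.0 / (q + 1) if k <= q else 0.0 for k in range(n)]  # uniform over the past
        rows.append(row)
    return rows


perfect = induction_score(synthetic_head(4, induction=True), 4)
uniform = induction_score(synthetic_head(4, induction=False), 4)
print(perfect, uniform)
```

A head that scores near 1 behaves like an induction head on this input, while a uniform head scores far lower. Comparing against a shuffled-control sequence, as the example dialogue suggests, guards against heads that score high for positional reasons alone.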

experiment-designer

/experiment-designer

A specification specialist agent that converts an adopted research idea into a complete experiment plan: RQ → falsifiable hypothesis → factor table → ablation matrix → primary/sanity metric → statistical test → seeds plan → GPU-h estimate. Implementation (the Green phase) is handed off to ml-engineer.

Use when: auto-research Phase 4, right after an idea from 03_IDEAS.md has been adopted.

<example> Context: An idea has been adopted and we want to enter the experiment phase. user: "Idea 2 has been adopted; please move on to Phase 4." assistant: "Dispatching experiment-designer to convert Idea 2 into an executable 04_EXPERIMENT_PLAN.md with ablation matrix, primary metric, statistical test, and GPU-h budget." </example>

<example> Context: The budget is tight, so a smaller ablation should be designed. user: "We only have 50 GPU-h. Re-design the ablation for Idea 1 within that." assistant: "experiment-designer will produce a budget-aware plan: smaller model variants, LoRA, subset eval, and reduced seed count with explicit power-analysis caveats." </example>

Do NOT use for: implementation (→ ml-engineer), statistical analysis (→ result-statistician), internal analysis (→ attention-analyst).
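To make the "factor table → ablation matrix → seeds plan → GPU-h estimate" chain concrete, here is a minimal sketch that enumerates a full factorial design and multiplies it out into a budget. The factors, seed list, and per-run cost are hypothetical placeholders, not values taken from the plugin.

```python
from itertools import product

# Hypothetical factors and per-run cost; a real plan would define these in 04_EXPERIMENT_PLAN.md.
FACTORS = {
    "method": ["baseline", "lora"],
    "data_fraction": [0.5, 1.0],
}
SEEDS = [0, 1, 2]
GPU_H_PER_RUN = 1.5


def ablation_matrix(factors, seeds):
    """Enumerate the full factorial design: one run per factor combination per seed."""
    names = list(factors)
    runs = []
    for values in product(*(factors[n] for n in names)):
        for seed in seeds:
            runs.append({**dict(zip(names, values)), "seed": seed})
    return runs


runs = ablation_matrix(FACTORS, SEEDS)
total_gpu_h = len(runs) * GPU_H_PER_RUN
print(len(runs), "runs,", total_gpu_h, "GPU-h")
```

Because the run count is the product of the factor levels and the seed count, a budget-constrained redesign (as in the 50 GPU-h example) usually means dropping a factor level or a seed rather than trimming individual runs.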

paper-deep-reader

/paper-deep-reader

A specialist agent that deep-reads a single arXiv paper at derivation level (reconstructing equations, algorithms, and ablation tables) and generates a note in the fixed schema of research.literature.matrix.

Use when: arxiv-mcp-agent has produced candidates breadth-first and the top 3-5 papers need to be read depth-first in parallel.

<example> Context: In auto-research Phase 2, after the literature survey, we want to read specific papers closely. user: "Just did a breadth survey of 12 papers; please deep-read 2403.12345 and 2406.99999." assistant: "Dispatching paper-deep-reader x 2 in parallel: each will produce a structured note in 02_SURVEY/notes/." </example>

<example> Context: We want to reproduce the method of a paper closely related to our own research. user: "Reproduce method of 2310.12345" assistant: "I'll start by sending paper-deep-reader to extract the equations, hyperparameters, and an algorithmic outline before we touch code." </example>

Do NOT use for: breadth-first search (→ arxiv-mcp-agent), cross-paper synthesis (→ research-gap-finder), the implementation itself (→ ml-engineer).

research-gap-finder

/research-gap-finder

A cross-paper synthesis specialist agent that reads five or more paper notes and the comparison table (MATRIX.md), discovers research gaps from untested cells, contradictions, and connections to adjacent fields, and scores new ideas on novelty / feasibility / impact. In Phase 8 it points out the weaknesses of the paper in reviewer mode.

Use when: dispatched in parallel in auto-research Phase 3 (gap & ideation), or reviewing a paper draft in Phase 8 (self-review).

<example> Context: The literature survey is finished and we want to generate ideas. user: "We've deep-read 6 papers, can we identify research gaps?" assistant: "I'll dispatch research-gap-finder x 3 in parallel with different seed angles (gap-by-cell / by-contradiction / by-adjacency) and merge their outputs into 03_IDEAS.md." </example>

<example> Context: We want our own paper draft critiqued from a reviewer's point of view. user: "Self-review the draft as if you were an ICLR reviewer." assistant: "Engaging research-gap-finder in reviewer mode: it will produce 08_REVIEW.md with Soundness / Presentation / Contribution / Reproducibility ratings and likely questions." </example>

Do NOT use for: deep reading of a single paper (→ paper-deep-reader), literature search (→ arxiv-mcp-agent), implementation (→ ml-engineer), statistical analysis (→ result-statistician).

Skills (16)

auto-research

/auto-research

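The main workflow, which drives the full LLM research lifecycle (literature → idea → experiment → paper) as a state machine with 8 phases / 4 user gates. Use when: you want to take new LLM research from an "idea seed", a paper URL, or a research-topic string all the way to a paper draft in one pass. Focus: Evaluation/Benchmark, Agent/Tool-use, Fine-tuning/Post-training, Prompt/ICL, and research on attention and LLM architecture internals. NOT for: one-off paper summaries (→ arxiv-mcp-agent), production ML inference pipelines (→ ml-engineer), Rust crate generation (→ rlac-create). Input: a natural-language research topic | an arXiv URL | resuming from an existing .research/<slug>/STATE.json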
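The "8 phases / 4 user gates" state machine can be pictured with a small sketch. The gate placement follows what this page states (G1 after Phase 1, G2 after Phase 3, G3 after Phase 4, G4 at Phase 8). The name of Phase 7 and the `advance` function are assumptions made for illustration, not the plugin's actual implementation.

```python
# Phase names 1-6 and 8 follow this listing; the Phase 7 name is an assumption.
PHASES = {
    1: "Topic Framing",
    2: "Literature Survey",
    3: "Gap & Ideation",
    4: "Experiment Design",
    5: "Scaffold + Baseline TDD",
    6: "Run & Analysis",
    7: "Paper Draft",  # assumed name
    8: "Self-Review",
}
GATES = {1: "G1", 3: "G2", 4: "G3", 8: "G4"}  # phase -> user gate that must be approved


def advance(phase, approved_gates):
    """Return the next phase, or the same phase if its user gate is still pending."""
    gate = GATES.get(phase)
    if gate is not None and gate not in approved_gates:
        return phase  # blocked until the user approves the gate
    return min(phase + 1, 8)


phase = 1
phase = advance(phase, approved_gates=set())   # G1 not approved yet, so the phase does not change
blocked_at = phase
phase = advance(phase, approved_gates={"G1"})  # G1 approved, move on
phase = advance(phase, approved_gates={"G1"})  # Phase 2 has no gate
print(blocked_at, PHASES[phase])
```

Keeping the gates in a separate table means the automated phases never need to know about approval logic; only `advance` consults it.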

research.attention.probe

/research.attention.probe

Mechanistic interpretability setup based on TransformerLens / nnsight. Runs the intervention protocols (logit lens, attention pattern, activation patching, path patching, probing, SAE feature lookup) efficiently by following a one-file-per-analysis rule (`analysis/<slug>.py`) and caching activations. Use when: running internal analysis in Phase 5/6 of a focus_area=attention auto-research project, or preparing the environment ahead of the attention-analyst agent.

research.autonomous.swarm

/research.autonomous.swarm

An extension of research.autonomous.tinker (v0.9.0). A research-org mode that runs N agents (default 3) in parallel and assigns each agent a different exploration strategy (depth-explore / lr-explore / arch-explore / batch-explore / random-restart) to ensure diversity. `swarm_orchestrate.sh` periodically aggregates every agent's BEST.json and shares the global best through SHARED_BEST.json. Use when: a single-agent tinker (v0.9.0) gets stuck in a local optimum, or you want to cover a wider search space overnight.
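The aggregation step can be illustrated with a short sketch. It is not a port of `swarm_orchestrate.sh`: the directory layout (one `BEST.json` per agent directory), the record fields, and the choice of the lowest `val_bpb` as "best" are all assumptions, and the metric values are made up for the demo.

```python
import json
import tempfile
from pathlib import Path


def aggregate_best(swarm_dir):
    """Pick the lowest-val_bpb BEST.json across agent directories and write SHARED_BEST.json."""
    candidates = [json.loads(p.read_text()) for p in sorted(swarm_dir.glob("*/BEST.json"))]
    best = min(candidates, key=lambda rec: rec["val_bpb"])
    (swarm_dir / "SHARED_BEST.json").write_text(json.dumps(best, indent=2))
    return best


# Demo with three hypothetical agents in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, strategy, bpb in [
        ("agent0", "depth-explore", 1.012),
        ("agent1", "lr-explore", 0.987),
        ("agent2", "arch-explore", 1.034),
    ]:
        (root / name).mkdir()
        record = {"agent": name, "strategy": strategy, "val_bpb": bpb}
        (root / name / "BEST.json").write_text(json.dumps(record))
    shared = aggregate_best(root)
    shared_on_disk = json.loads((root / "SHARED_BEST.json").read_text())

print(shared["agent"], shared["val_bpb"])
```

Sharing only the single best record keeps the agents loosely coupled: each can restart from the global best without reading the others' full histories.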

research.autonomous.tinker

/research.autonomous.tinker

An autonomous tinker mode inspired by karpathy/autoresearch (March 2026, MIT). As an alternative mode for Phase 5-6 of the 8-phase workflow, the agent iteratively edits a single file, `tinker/train.py`, and autonomously explores nanochat-style single-GPU LLM training overnight under a fixed wall-clock budget (default 5 minutes). The single comparison metric is val_bpb (vocab-size-independent). Use when: `mode: tinker` was chosen in the Phase 4 experiment plan, or you want an agent to optimize LLM training on a single GPU while you sleep.
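The listing names val_bpb without defining it. Assuming it is the usual bits-per-byte measure, the sketch below shows why such a metric does not depend on vocabulary size: the summed token loss is normalized by the UTF-8 byte count of the text, not by the token count. `bits_per_byte` is a hypothetical helper, not code from `tinker/train.py`.

```python
import math


def bits_per_byte(token_nll_nats, text):
    """Convert summed per-token negative log-likelihood (nats) into bits per UTF-8 byte."""
    n_bytes = len(text.encode("utf-8"))
    total_bits = sum(token_nll_nats) / math.log(2)
    return total_bits / n_bytes


text = "hello world"  # 11 bytes in UTF-8
# Two hypothetical tokenizations of the same text with the same total loss:
coarse = [math.log(2) * 11]  # one large-vocabulary token carrying 11 bits
fine = [math.log(2)] * 11    # eleven byte-level tokens carrying 1 bit each
bpb_coarse = bits_per_byte(coarse, text)
bpb_fine = bits_per_byte(fine, text)
print(bpb_coarse, bpb_fine)
```

Per-token loss differs by a factor of 11 between the two tokenizations, yet bits per byte agrees. That is what lets the agent change the tokenizer or vocabulary between runs and still compare results.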

research.compute.shop

/research.compute.shop

A skill that ranks and recommends the best GPU resource providers for a given workload (gpu_type, count, duration_h). Beyond AWS/GCP/Azure, it covers 18 providers, including marketplaces such as Lambda/RunPod/Vast.ai/Salad/TensorDock/CoreWeave/DataCrunch, free tiers such as Colab/Kaggle/HF ZeroGPU, and academic grants such as GCP TRC/NSF ACCESS/national HPC. Use when: right after the compute estimate is settled in Phase 4 (Experiment Design), or before renting, when you want to know where the same workload can run most cheaply.

Hooks (1)
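Ranking providers for a workload can be sketched as a filter followed by a cost sort. The `Offer` record, the provider names, and the prices below are placeholders invented for illustration; they are not real quotes and not the skill's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    provider: str
    gpu_type: str
    usd_per_gpu_hour: float  # placeholder prices, not real quotes
    max_gpus: int


def rank_offers(offers, gpu_type, count, duration_h):
    """Return (provider, total_cost) pairs that can serve the workload, cheapest first."""
    feasible = [o for o in offers if o.gpu_type == gpu_type and o.max_gpus >= count]
    costed = [(o.provider, o.usd_per_gpu_hour * count * duration_h) for o in feasible]
    return sorted(costed, key=lambda pair: pair[1])


OFFERS = [
    Offer("provider-a", "A100", 2.0, 8),
    Offer("provider-b", "A100", 1.5, 4),
    Offer("provider-c", "H100", 3.0, 8),
    Offer("provider-d", "A100", 1.0, 1),
]
ranking = rank_offers(OFFERS, gpu_type="A100", count=2, duration_h=10)
print(ranking)
```

Here provider-d has the lowest hourly price but cannot supply two GPUs, so it is filtered out before costing. A fuller ranking would also weigh availability, free-tier limits, and grant eligibility, which a price sort alone does not capture.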

Event Hooks

Bash

1 hook across 1 event

MCP Servers (2)

Stats

Version: 0.18.0

Released: May 11, 2026

Language: Shell

Stars: 0

Maintenance: Excellent

License: MIT

Last Commit: May 11, 2026

Added: May 9, 2026

Safety Signals

Caution

Executes bash commands

Hook triggers when Bash tool is used

Requires secrets

Needs API keys or credentials to function

Uses power tools

README

auto-research

A plugin for carrying the full LLM research lifecycle through end to end on Claude Code. An automated workflow of 8 phases / 4 user gates covers paper survey → idea validation → experiment design, implementation, and execution → paper draft.

Research focus areas:

Evaluation & Benchmarking
Agent / Tool-use research
Fine-tuning / Post-training (SFT, RLHF, DPO, LoRA, ...)
Prompt / In-context Learning
Attention mechanisms and LLM architecture internals (mechanistic interpretability)

Prerequisites

Python (uv) + PyTorch + HuggingFace Transformers
An already configured arxiv-mcp-server (not bundled with this plugin)
(Recommended) API tokens for Semantic Scholar / HuggingFace Hub / GitHub

Installation (local)

This plugin has not yet been published to a marketplace. You can install it directly from a local directory. Official documentation: https://docs.claude.com/en/docs/claude-code/plugins

Method A: via a local marketplace (recommended, persistent)

.claude-plugin/marketplace.json is bundled. Inside a Claude Code session:

/plugin marketplace add ~/path/to/auto-research
/plugin install auto-research@auto-research

/plugin marketplace add <path> registers this directory as a marketplace
/plugin install auto-research@auto-research enables the plugin itself (the format is <plugin-name>@<marketplace-name>; both are auto-research, hence the repetition)

Verify:

/plugin list           # list of enabled plugins
/help                  # success if /auto-research:research-start ... is shown

To disable the plugin, run /plugin uninstall auto-research@auto-research; to remove the marketplace as well, run /plugin marketplace remove auto-research.

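Method B: CLI flag (temporary, no configuration changes)

To enable the plugin for a single session only:

claude --plugin-dir ~/path/to/auto-research

The flag must be passed every time Claude Code is started. Intended for testing.

Method C: symbolic links (manual, minimal)

Deploy directly into the user scope without using a marketplace: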

PLUGIN=~/path/to/auto-research

# Make sure the user-scope target directories exist
mkdir -p "$HOME/.claude/skills" "$HOME/.claude/agents" "$HOME/.claude/commands"

# Link skills into ~/.claude/skills/
for d in "$PLUGIN"/skills/*/; do
  # ${d%/} drops the trailing slash; -n replaces an existing link instead of descending into it
  ln -sfn "${d%/}" "$HOME/.claude/skills/$(basename "$d")"
done

# Link agents into ~/.claude/agents/
for f in "$PLUGIN"/agents/*.md; do
  ln -sf "$f" "$HOME/.claude/agents/$(basename "$f")"
done

# Link commands into ~/.claude/commands/
for f in "$PLUGIN"/commands/*.md; do
  ln -sf "$f" "$HOME/.claude/commands/$(basename "$f")"
done

The PostToolUse hook and .mcp.json must be merged manually (edit ~/.claude/settings.json and ~/.claude.json). If you prefer simplicity, Method A is recommended.

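Verifying the installation

/help

The installation succeeded if the following are listed:

/auto-research:research-start    Start a new LLM research project ...
/auto-research:research-design   Gap analysis, idea extraction, and experiment design ...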
/auto-research:research-experiment
/auto-research:research-write
/auto-research:research-review
/auto-research:research-status

Also confirm that the skills and agents can be reached from /skills and from the Agent tool:

> "Use the auto-research skill to do a Phase 1 dry run with 'test topic'."

Tests (v0.3.0+)

During development, tests/run_all.sh runs the smoke / schema / regression tests in one go (CI also runs them automatically):

bash tests/run_all.sh
# expect 8 tests pass / 0 fail

Dependencies: bash, jq, python3 + jsonschema (or uv). See tests/README.md for details.

Handling MCP servers

.mcp.json bundles two servers that have been verified to work. On first launch Claude Code shows a confirmation dialog; approve it.

Bundled (verified on a real machine)

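| MCP server | Package | Purpose | Authentication |
| --- | --- | --- | --- |
| `semantic-scholar` | `uvx semanticscholar-mcp-server` | Paper metadata and citation-graph completion in Phase 2/8; DOI completion for refs.bib | `SEMANTIC_SCHOLAR_API_KEY` (optional; works without it, but rate limits are strict) |
| `github` | `npx -y @modelcontextprotocol/server-github` | Fetching official paper code and tracking issues in Phase 5/8 | `GITHUB_PERSONAL_ACCESS_TOKEN` (optional; not needed when only public repos are used) |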

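Not bundled (dependencies and prerequisites only)

arxiv-mcp-server (blazickjp/arxiv-mcp-server): the core of paper discovery, retrieval, and reading. It is assumed that the user has already configured it in ~/.claude.json.
HuggingFace Hub: hands-on verification showed that the huggingface-mcp-server package on PyPI is an anonymous scaffold (no author / no docs / Python>=3.13 only), so it has not been bundled since v0.3.0. Instead, access the Hub directly from experiment code through the huggingface_hub Python library, or configure an HF MCP server that you trust yourself.

Disabling the servers

Remove them from enabledMcpjsonServers in ~/.claude/settings.json
Or leave the related environment variables unset (the servers run with the unauthenticated fallback)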

Uninstall

Method A:

/plugin uninstall auto-research@auto-research
/plugin marketplace remove auto-research

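If installed via Method C (symlink):

# Delete symlinks in ~/.claude/skills, agents, and commands that point into the plugin directory.
# The pattern matches link targets; adjust it if your plugin directory is not named auto-research.
find ~/.claude/skills ~/.claude/agents ~/.claude/commands -lname "*/auto-research/*" -delete

Quick start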

/auto-research:research-start "attention sink in long-context Llama"
# Phase 1 (Topic Framing) → G1
# Phase 2 (Literature Survey) → generates MATRIX.md

/auto-research:research-design
# Phase 3 (Gap & Ideation) → G2
# Phase 4 (Experiment Design) → G3

/auto-research:research-experiment
# Phase 5 (Scaffold + Baseline TDD)
# Phase 6 (Run & Analysis)
