By maxwell2732
Enforces a structured, reproducible research lifecycle for empirical projects: data auditing, cleaning, analysis, human review, and replication packaging — all within a controlled execution environment with integrity gates and cross-session memory.
npx claudepluginhub maxwell2732/claudecode-research-harness-workflow --plugin claudecode-research-harness-workflow
Non-executing advisor that receives advisor-request.v1 from an executor and returns only a decision response.
Research analyst agent — writes and runs R/Stata/Python scripts, saves logs and outputs, updates analysis_plan.md evidence ledger. Never fabricates numbers. Never modifies raw data.
Read-only reviewer for research outputs and software code. For research: checks identification, numerical accuracy, causal claims, cleaning completeness. For software: checks spec alignment, TDD, and security. Never runs code. Never edits files.
Integrated scaffolder that handles project setup in 3 modes: analyze, scaffold, and update-state.
Integrated worker that handles implementation, preflight self-check, verification, and commit preparation — one task at a time.
Browser automation through the repo agent-browser CLI. Explicit helper for navigation, forms, screenshots, scraping, and web-app checks. Prefer Browser Use or Playwright when available. Do NOT load for: sharing URLs, embedding links, or editing screenshot files.
Explicit helper for authentication and payment implementation with Clerk, Supabase Auth, or Stripe. Do NOT load for: general UI work, database design, or non-auth features.
CI red? Call us. Pipeline fire brigade deploys. Use when user mentions CI failures, build errors, test failures, or pipeline issues. Do NOT load for: local builds, standard implementation work, reviews, or setup.
Explicit helper for CRUD scaffolding and API endpoint generation. Do NOT load for: UI components, form design, database schema discussion, or general implementation.
RES: Read-only review of research outputs. Checks identification, model spec, numerical accuracy, causal claims, reproducibility. Produces review_report.md with APPROVE/REQUEST_CHANGES/BLOCK verdict. Trigger: review research, check results, review analysis, verify outputs. Do NOT load for: cleaning, execution, release, setup, audit, planning.
Matches all tools
Hooks run on every tool call, not just specific ones
Executes bash commands
Hook triggers when Bash tool is used
Modifies files
Hook triggers on file write and edit operations
Uses power tools
Uses Bash, Write, or Edit tools
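A controlled execution framework for agent-assisted empirical research.
Turn Claude Code from an agent that can write code into an auditable, reproducible research collaborator.
Author: Chen Zhu (朱晨) | 遗传社科研究 Chen Zhu | China Agricultural University (CAU)
Last updated: 2026-06-03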
Specify → Audit → Clean → Plan → Work → Review → Release
Define the question → Audit the data → Clean and merge → Make a plan → Run the analysis → Review the results → Release the replication package
/research-harness-setup ·
/research-harness-audit ·
/research-harness-clean ·
/research-harness-plan ·
/research-harness-work ·
/research-harness-review ·
/research-harness-release
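+ Extract an analysis-ready subsample as defined by a paper
Given any published paper, extract from the raw survey panel a subsample and codebook that match the paper's sample selection, variable construction, and coding.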
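Claude Code Research Harness is an empirical-research execution framework for Claude Code.
It is not a set of prompts, nor an ordinary workflow, but a Harness: a controlled research environment made up of rules, files, checkpoints, and evidence requirements. It keeps AI agents inside a research process that is reproducible, auditable, and traceable.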
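The ordinary approach is to ask the agent directly:
"Analyze this data for me."
With Claude Code Research Harness, the agent works inside an institutionalized environment:
"Audit the raw data read-only; do not modify it.
First generate a cleaning plan.
Then write reproducible scripts.
Save the logs.
Verify every number.
Stop as soon as the evidence chain breaks."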
The core rule is simple:
No script, no log, no claim.
If there is no script and no log, there is no conclusion.
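The rule above can be sketched as a small gate that refuses any claim whose script or log is missing. This is only an illustration under stated assumptions: `Claim`, `missing_evidence`, and `gate` are hypothetical names invented here, not part of the plugin, and the actual Harness may record evidence differently.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Claim:
    """One reported number plus the evidence that is supposed to back it (hypothetical structure)."""
    text: str     # the sentence or number being reported
    script: str   # path, relative to the project root, of the script that produced it
    log: str      # path, relative to the project root, of the log written when the script ran


def missing_evidence(claim: Claim, root: Path) -> list[str]:
    """Return the evidence files that are absent or empty for this claim."""
    problems = []
    for label, rel in (("script", claim.script), ("log", claim.log)):
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(f"{label}: {rel}")
    return problems


def gate(claims: list[Claim], root: Path) -> dict[str, list[str]]:
    """Map each unsupported claim to its missing evidence; an empty dict means every claim passes."""
    report = {}
    for claim in claims:
        problems = missing_evidence(claim, root)
        if problems:
            report[claim.text] = problems
    return report
```

A non-empty result from `gate` is the point at which, in the spirit of the rule, work should stop rather than report the number.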
Before you start: you must have Claude Code, Python 3 (or Miniconda), and git installed, and you need a registered GitHub account.
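# Fork this repository on GitHub (click "Fork" on the repository page), then:
# (replace "YOUR_USERNAME" with your own GitHub username; the URL below points at the upstream repository, so substitute your username for maxwell2732 to clone your fork)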
git clone https://github.com/maxwell2732/claudecode-research-harness-workflow.git
cd claudecode-research-harness-workflow
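[Tip] You can also download this repository as a zip file and unzip it locally. That approach gives you no version control, so it is not recommended.
# Make sure you are inside the local repository directory, e.g. C:\claudecode-research-harness-workflow, then start Claude Code: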
claude
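Taking data cleaning as an example: put the raw data you want to clean into the data/ folder, then adapt the following prompt to your needs and paste it into Claude Code (CC):
I have put the raw data to be cleaned and merged, [DATA NAME], in data/. Please read claude.md and the other configuration files and follow them strictly: merge the data into a single panel-data CSV, and also generate a codebook CSV and an execution report in Markdown. Use /plan mode.
What this prompt does: CC reads the workflow configuration files in the repository, invokes the relevant agents and skills, plans and carries out the cleaning process, then generates and verifies the results, following the Harness approach strictly throughout.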
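To make the "codebook CSV" deliverable concrete, here is a minimal sketch of deriving one from a merged panel CSV using only the standard library. The function name `build_codebook` and the codebook columns (`variable`, `n_rows`, `n_missing`, `n_unique`, `example`) are assumptions chosen for illustration; the codebook the plugin's agents actually produce may contain different fields.

```python
import csv
from pathlib import Path

# Hypothetical codebook layout; the real Harness output may differ.
CODEBOOK_FIELDS = ["variable", "n_rows", "n_missing", "n_unique", "example"]


def build_codebook(panel_csv: Path, codebook_csv: Path) -> list[dict]:
    """Summarise each column of a panel CSV and write the summary as a codebook CSV."""
    with panel_csv.open(newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        columns = reader.fieldnames or []
        rows = list(reader)

    codebook = []
    for col in columns:
        values = [row[col] for row in rows]
        # Empty strings and None (short rows) both count as missing.
        present = [v for v in values if v not in ("", None)]
        codebook.append({
            "variable": col,
            "n_rows": len(values),
            "n_missing": len(values) - len(present),
            "n_unique": len(set(present)),
            "example": present[0] if present else "",
        })

    with codebook_csv.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=CODEBOOK_FIELDS)
        writer.writeheader()
        writer.writerows(codebook)
    return codebook
```

The script only reads the panel file and writes a separate codebook file, so it can be rerun at any time without touching the data it describes.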
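In software development and so-called vibe coding, a harness usually means keeping an AI coding agent inside an engineered process: write the requirements spec first, then break the work into tasks, then modify the code, then run the tests, and finally pass review and release checks. It addresses one core problem: do not let the agent write code on instinct alone; make every change something that can be tested, rolled back, and accepted.
Claude Code Research Harness inherits this idea but changes the object from "software code" to "empirical research". In research, the real danger is often not a syntax error but opaque sample construction, untraceable data cleaning, wrong merge keys, regression results that cannot be reproduced, causal claims that go beyond the identification design, and numbers in the text that have no corresponding log. Research Harness therefore does not just require that the agent's "code runs"; it requires the agent to prove:
Put differently, the harness used in software development is closer to a Coding Harness: it cares about code changes, passing tests, and release readiness. Claude Code Research Harness is a Research Harness: it cares about the research question, data provenance, sample construction, credibility of identification, reproducibility of results, and integrity of the evidence chain.
Its relationship to Vibe Research is similar. Vibe Research emphasizes driving research code generation with natural language, so that researchers can turn ideas into scripts, tables, and figures faster. But with only the vibe and no harness, an agent easily mistakes "looks finished" for "actually reproducible". Research Harness adds an institutional layer on top of Vibe Research: researchers may move the work forward in natural language, but every step must leave evidence that can be checked.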
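In short:
Research Harness makes the agent produce research evidence more responsibly.
Or, more directly:
A software harness constrains how the agent writes code.
A Research Harness constrains how the agent does research.
The former aims for "code that passes tests"; the latter aims for "conclusions that can be traced back".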
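AI agents can already help researchers write code, clean data, run models, and generate tables. But without structural constraints, agent work in empirical research is prone to risk.
Common problems include:
Claude Code Research Harness turns these risks into explicit checkpoints:
That is the difference between an AI assistant and an auditable research agent.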
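| Stage | Command | What the Harness enforces |
|---|---|---|
| 1 | /research-harness-setup | Initializes the research contract, folder structure, and raw-data protection rules. |
| 2 | /research-harness-audit | Read-only audit of the raw data: files, variables, missing values, IDs, units, and feasibility. |
| 3 | /research-harness-clean | Writes and runs reproducible scripts for data cleaning, variable harmonization, reshape, and merge. |
| 4 | /research-harness-plan | Generates an executable analysis plan from the research spec, the data audit, and the structure of the cleaned data. |
| 5 | /research-harness-work | Executes approved analysis tasks and verifies scripts, logs, and outputs before marking them complete. |
| 6 | /research-harness-review | Reviews the identification strategy, model specification, sample construction, numerical consistency, and causal claims. |
| 7 | /research-harness-release | Packages the replication archive: scripts, logs, outputs, reports, and reproduction instructions. |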
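One way to picture the raw-data protection and read-only audit named in the table is a checksum snapshot taken before a stage runs and compared afterwards. The sketch below is an assumption about how such a check could look, not the plugin's actual mechanism; `sha256_of`, `snapshot`, and `changed_files` are hypothetical helpers.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large raw files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw_dir (relative POSIX path) to its SHA-256 digest."""
    return {
        p.relative_to(raw_dir).as_posix(): sha256_of(p)
        for p in sorted(raw_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified between two snapshots."""
    names = sorted(set(before) | set(after))
    return [n for n in names if before.get(n) != after.get(n)]
```

Taking `snapshot(Path("data"))` before and after a stage and requiring `changed_files` to return an empty list would show that the stage left the raw data untouched.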
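You can run the full lifecycle or invoke just one of its modules.
You do not need to run the whole workflow every time. The framework supports entering at an intermediate stage.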