From claudecode-research-harness-workflow
Read-only review of research outputs checking identification, numerical accuracy, causal claims, and reproducibility. Produces a structured report with APPROVE/REQUEST_CHANGES/BLOCK verdict.
npx claudepluginhub maxwell2732/claudecode-research-harness-workflow --plugin claudecode-research-harness-workflow
How this skill is triggered (by the user, by Claude, or both):
Slash command
/claudecode-research-harness-workflow:research-harness-review [--quick] [--task TASK-ID]
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Perform an independent, read-only review of research outputs before release.
Perform an independent, read-only review of research outputs before release.
This skill reads existing scripts, logs, and outputs. It does not run code. It does not edit scripts or data. It produces a structured review report with a verdict.
This skill runs after /research-harness-work and before /research-harness-release.
| Input | Action |
|---|---|
| /research-harness-review | Full review of all cc:done tasks in analysis_plan.md |
| /research-harness-review --quick | Abbreviated review: identification + numerical accuracy only |
| /research-harness-review --task 2.1 | Review a single task |
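The invocation forms in the table above can be read as two independent options. The following is a minimal sketch of that interpretation, not the skill's actual parser: the function names `parse_review_args` and `describe_review` are hypothetical, and the behavior when both flags are combined is an assumption, since the source only documents each flag on its own.

```python
import argparse


def parse_review_args(argv):
    """Parse the two documented flags: --quick and --task TASK-ID."""
    parser = argparse.ArgumentParser(prog="/research-harness-review")
    parser.add_argument(
        "--quick",
        action="store_true",
        help="abbreviated review: identification + numerical accuracy only",
    )
    parser.add_argument(
        "--task",
        metavar="TASK-ID",
        default=None,
        help="review a single task instead of all cc:done tasks",
    )
    return parser.parse_args(argv)


def describe_review(args):
    """Return (scope, depth) for the parsed arguments.

    Assumption: the two flags are independent, so --quick --task 2.1
    means a quick review of task 2.1 only.
    """
    scope = args.task if args.task is not None else "all cc:done tasks"
    depth = "quick" if args.quick else "full"
    return scope, depth


if __name__ == "__main__":
    print(describe_review(parse_review_args(["--task", "2.1"])))
```

With no arguments the sketch falls back to the full review of every cc:done task, matching the first row of the table.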
1. Read analysis_plan.md. If it does not exist, stop.
2. Find the tasks marked cc:done. If none, report that no completed tasks exist to review.
3. List the cc:done tasks. These are the review scope.
4. Read study_spec.md, reports/data_audit_report.md, reports/data_cleaning_report.md, and reports/merge_report.md (if it exists).

This skill is read-only throughout. No Bash commands, no script execution, no file writes except the review report.
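The scope selection described above could be sketched as follows. The layout of analysis_plan.md is not given in this document, so the sketch assumes one task per line, with a dotted numeric ID (such as 2.1) and the cc:done marker on the same line. The names `done_task_ids` and `review_scope` are hypothetical, and the code only reads the plan file, in keeping with the read-only rule.

```python
import re
from pathlib import Path

# Assumed line format (not confirmed by the source): "<id> <title> cc:done"
TASK_LINE = re.compile(r"(?P<id>\d+(?:\.\d+)*)\b.*\bcc:done\b")


def done_task_ids(plan_text):
    """Return the IDs of all lines carrying the cc:done marker."""
    ids = []
    for line in plan_text.splitlines():
        match = TASK_LINE.search(line)
        if match:
            ids.append(match.group("id"))
    return ids


def review_scope(plan_path, task=None):
    """Return the task IDs to review, or None if the plan is missing.

    None means "stop"; an empty list means "no completed tasks exist
    to review". Passing task restricts the scope to that single ID.
    """
    path = Path(plan_path)
    if not path.exists():
        return None
    done = done_task_ids(path.read_text(encoding="utf-8"))
    if task is not None:
        done = [task_id for task_id in done if task_id == task]
    return done
```

Distinguishing None from an empty list keeps the two stop conditions separate: a missing plan ends the review outright, while a plan with no completed tasks produces a report saying so.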
Read study_spec.md §2 (identification strategy).
For each main model task in analysis_plan.md, check:
Assign one of: strong / moderate / weak / insufficient
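An insufficient rating immediately produces at least REQUEST_CHANGES, and BLOCK when the identification is insufficient for the causal claims being made (see the verdict rules). Do not classify findings as minor if the identification is insufficient for the causal claim being made.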
For each cc:done analysis task:
- Compare the task against study_spec.md §4 and §5.
- Classify each change as minor (changes that do not affect the main result), major (changes that affect the result), or critical (changes that contradict the approved study design).

For each cc:done task with an output file:
- Mark any number that cannot be traced to an existing log or output as unverified.

An unverified number is a critical finding if it will appear in the final reported results.
Do not verify numbers by re-running scripts. Only read existing logs and outputs.
- Classify each difference as minor (small difference with a plausible explanation), major (large unexplained difference), or critical (N is clearly wrong).

Read any interim outputs, table notes, or text summaries (if present in output/ or reports/).
For every causal claim found:
- Check its evidence label ([descriptive], [correlational], [quasi-experimental: ...], [experimental]).
- Record overstated claims as major findings.

Do not rewrite weak evidence as strong causal evidence. If the evidence is correlational, the claim must be correlational.
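A rough first-pass screen for the label check above might look like the sketch below. It is a heuristic under stated assumptions, not the skill's method: the list of causal phrases in `CAUSAL_WORDS` is hypothetical, as is the function name `check_claim`, and a reviewer would still read each flagged sentence in context. Only the four evidence labels come from this document.

```python
import re

# The four evidence labels named in the review step.
LABEL = re.compile(
    r"\[(descriptive|correlational|quasi-experimental: [^\]]+|experimental)\]"
)

# Hypothetical, incomplete list of causal phrasings (an assumption).
CAUSAL_WORDS = re.compile(
    r"\b(causes?|caused|leads? to|led to|effect of|impact of)\b",
    re.IGNORECASE,
)

# Labels that can support causal wording.
CAUSAL_LABELS = ("quasi-experimental", "experimental")


def check_claim(sentence):
    """Classify a claim as 'unlabeled', 'overstated', or 'ok'."""
    match = LABEL.search(sentence)
    if match is None:
        return "unlabeled"
    label = match.group(1).split(":")[0]
    if CAUSAL_WORDS.search(sentence) and label not in CAUSAL_LABELS:
        # Causal wording on descriptive or correlational evidence.
        return "overstated"
    return "ok"
```

For instance, `check_claim("Tutoring causes higher scores [correlational]")` returns "overstated", while the same label on "is associated with" wording returns "ok".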
- reports/data_cleaning_report.md exists and its verification section is PASS
- reports/merge_report.md exists (if merges were performed) and all entries have pre/post row counts
- data/raw/ was not modified (if git status data/raw/ or equivalent is available from prior logs, read it)

Copy templates/review_report.md to reports/review_report.md and fill in all sections from Steps 1–7.
Verdict rules:
| Condition | Verdict |
|---|---|
| No critical or major findings | APPROVE |
| One or more major findings (but no critical) | REQUEST_CHANGES |
| One or more critical findings | BLOCK |
| Identification is insufficient for the causal claims made | BLOCK |
| Any result cannot be traced to a log | BLOCK |
BLOCK is a stronger form of REQUEST_CHANGES. It means the research cannot be released in any form until the finding is resolved.
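The verdict rules above amount to a small precedence check, with every BLOCK condition tested before REQUEST_CHANGES. A minimal sketch follows. The `Finding` class, the `decide_verdict` signature, and the `untraced_results` counter are hypothetical names, and treating an `insufficient` identification rating as equivalent to "insufficient for the causal claims made" is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str      # "minor", "major", or "critical"
    description: str


def decide_verdict(findings, identification="strong", untraced_results=0):
    """Apply the verdict table, checking BLOCK conditions first."""
    has_critical = any(f.severity == "critical" for f in findings)
    has_major = any(f.severity == "major" for f in findings)

    # Any of the three BLOCK rows in the table.
    if has_critical or identification == "insufficient" or untraced_results > 0:
        return "BLOCK"
    # Major findings without a critical one.
    if has_major:
        return "REQUEST_CHANGES"
    # Only minor findings, or none at all.
    return "APPROVE"
```

Usage: `decide_verdict([Finding("major", "N differs from log")])` returns "REQUEST_CHANGES", and adding a single critical finding to the list changes the result to "BLOCK".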
Print a review summary:
Research Harness Review — Complete
Tasks reviewed: N
Scope: [task IDs]
Identification credibility: strong / moderate / weak / insufficient
Numerical accuracy: all verified / N unverified
Causal claims: all appropriate / N overstated
Critical findings: N
Major findings: N
Minor findings: N
Verdict: APPROVE / REQUEST_CHANGES / BLOCK
Review report: reports/review_report.md
If verdict is REQUEST_CHANGES or BLOCK:
Required actions before /research-harness-release:
1. [Finding 1 — required action]
2. [Finding 2 — required action]
Return to /research-harness-work to re-execute affected tasks, then re-run /research-harness-review.
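The finding counts and the numbered required actions in the summary above can be produced mechanically once the findings are classified. The sketch below covers only those lines of the summary. The function names are hypothetical, and the input is assumed to be a plain list of severity strings.

```python
from collections import Counter

SEVERITIES = ("critical", "major", "minor")


def finding_count_lines(severities):
    """Render the three 'findings' lines of the review summary."""
    counts = Counter(severities)
    unknown = set(counts) - set(SEVERITIES)
    if unknown:
        raise ValueError(f"unknown severities: {sorted(unknown)}")
    return [f"{name.capitalize()} findings: {counts[name]}" for name in SEVERITIES]


def required_action_lines(actions):
    """Number the required actions the way the summary lists them."""
    return [f"{i}. {action}" for i, action in enumerate(actions, start=1)]


if __name__ == "__main__":
    for line in finding_count_lines(["major", "minor", "minor"]):
        print(line)
```

Rejecting unknown severities keeps a mistyped label from silently disappearing from the counts.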
- Never issue APPROVE when any critical finding exists
- reports/review_report.md exists with all sections populated
- Verdict is one of APPROVE, REQUEST_CHANGES, BLOCK
- cc:done tasks reviewed
- reports/review_report.md written

If verdict is APPROVE:
Review passed. Run /research-harness-release to package the replication archive.
If verdict is REQUEST_CHANGES or BLOCK:
Return to /research-harness-work to resolve the findings listed in reports/review_report.md. After re-executing affected tasks, run /research-harness-review again. Do not run /research-harness-release until the verdict is APPROVE.