From ai4ss-skills
Audits social-science empirical work: research design, scripts, regression output, tables, figures, robustness checks, and reproducibility evidence. Use before submission or reviewer response.
Slash command
/ai4ss-skills:methods-reviewer
Audit empirical work before it becomes a paper, talk, or response. The default output is an issue table with evidence, severity, and concrete next actions.
This skill answers: "Does the interpretation of the results overclaim?" Its value is not replacing methodological judgment; it is exposing whether the script, model object, table, figure, and written claim actually support the same interpretation.
Review first, edit second. User permission may authorize focused code, table, or figure fixes after the review, but manuscript claims must remain claim-ledger rows, risk labels, and author revision targets. Do not provide replacement manuscript wording.
This skill is the Diagnose and Redesign layer of the MIDA spine. It checks whether declared Model, Inquiry, Data strategy, Answer strategy, executed outputs, and public claims still refer to the same research design.
The review must name diagnosands or gates such as wrong estimand, weak comparison, measurement mismatch, inference mismatch, reproducibility failure, source-status risk, or overclaiming. Redesign recommendations remain author decisions unless the user explicitly asks for implementation.
When a .aiss model is present, reviewing the model is part of the methods audit: aiss.py compile/lint/run errors, missing bridges, unchecked commensurability, and model-to-output mismatch are reportable issues. When a theory workbench is present, rival explanations, scope drift, vague mechanisms, non-discriminating observable implications, weak source status, and theory overclaim are issue-table rows, not a separate theory-review schema.
Inputs: study_design_brief.md, study_design_declaration.csv, research_model.aiss, ai4ss_check_report.txt, analysis_run_manifest.csv, scripts, logs, tables, figures, data audit outputs, literature matrices, literature_theory_synthesis.csv, theory_rival_map.csv, theory_scope_map.csv, manuscript snippets, or reviewer comments.
Output fields: route_id, design_source, target_inquiry, mida_component, analysis_outputs, issue_table, severity, evidence, next_action, author_decisions, ai4ss_model_path, model_id, concept_id, causal_id, bridge_id, ai4ss_check_status, commensurability_status, next_skill_route.
Next-skill routes: research-data-builder, research-analysis-runner, study-design-builder, academic-writing-scaffold, reviewer-response, research-slides-builder, did-expert, or ask_author.
Use this skill to audit evidence for identification validity, result-claim fit, robustness, inference, and reproducibility. Final scholarly judgment remains with the author.
Do not use it to build data pipelines; hand data construction to research-data-builder. Do not use it as the first executor of an analysis plan; hand execution to research-analysis-runner. Do not use it to write manuscript prose or response letters; hand evidence-ready scaffolds to academic-writing-scaffold or reviewer-response.
Step -1: Orient
-> Read AGENTS.md, research design notes, scripts, logs, tables, figures, and manuscript/output text.
-> Identify the design family: descriptive, OLS, DID, IV, RD, RCT, synthetic control, panel, qualitative, mixed methods.
-> For DID/event-study as the central task, invoke $did-expert first when available; use this skill to wrap its findings into the general issue table.
-> If `research_model.aiss` is present, run or inspect `scripts/validate_ai4ss_model.py` output before judging claims.
Step 0: Build audit scope
-> Data construction, model specification, inference, diagnostics, robustness, reporting, reproducibility, writing claims.
Step 1: Inspect evidence
-> Compare stated design against actual scripts and outputs.
-> Check whether tables/figures expose sample, variables, FE, clustering, and uncertainty.
-> Trace suspicious numbers back to scripts or logs.
-> Compare `.aiss` concepts, causal implications, and bridges against design declarations, data audits, and analysis manifests.
-> If a theory workbench is present, audit rival explanations, scope rows, mechanism parts, source-status support, and observable implications against the declared design.
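Tracing a suspicious number back to a log can be sketched as follows. This is a minimal illustration, not part of the skill's own scripts: the function name `trace_number` and the rounding heuristic are assumptions. It treats a reported value as rounded to the number of decimals shown and looks for log lines containing a number that rounds to the same value.

```python
import re

# Matches plain integers and decimals, optionally in scientific notation.
NUMBER = re.compile(r"-?\d+\.\d+(?:[eE][-+]?\d+)?|-?\d+")


def trace_number(reported: str, log_lines: list[str]) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs whose numbers round to `reported`.

    `reported` is the value exactly as printed in the table (a string), so the
    number of decimals shown tells us how the author rounded it.
    Hypothetical helper for illustration only.
    """
    decimals = len(reported.split(".")[1]) if "." in reported else 0
    target = round(float(reported), decimals)
    hits = []
    for lineno, line in enumerate(log_lines, start=1):
        for token in NUMBER.findall(line):
            if round(float(token), decimals) == target:
                hits.append((lineno, line.rstrip()))
                break  # one hit per line is enough for the evidence column
    return hits
```

A hit gives a file line to cite in the evidence column. An empty result is not proof of an error, only a prompt to check where the number came from.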
Step 2: Produce issue table
-> Use severity and confidence.
-> Separate confirmed bugs from risks, missing evidence, and author decisions.
-> Give exact file paths and lines where possible.
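One way to keep confirmed bugs apart from risks, missing evidence, and author decisions is to give every finding an explicit status and group on it. The sketch below is illustrative only: the `Issue` class, the `group_by_status` helper, and the exact status strings are assumptions, since the skill names the four categories but not their labels. The fields mirror the issue-table columns.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical status labels: the categories come from the skill text,
# the exact strings are an assumption made for this sketch.
STATUSES = ("confirmed_bug", "risk", "missing_evidence", "author_decision")


@dataclass(frozen=True)
class Issue:
    severity: str
    issue: str
    evidence: str  # exact file path and line where possible
    why_it_matters: str
    next_action: str
    status: str


def group_by_status(issues: list[Issue]) -> dict[str, list[Issue]]:
    """Group findings by status, rejecting labels outside STATUSES."""
    grouped: dict[str, list[Issue]] = defaultdict(list)
    for item in issues:
        if item.status not in STATUSES:
            raise ValueError(f"unknown status: {item.status!r}")
        grouped[item.status].append(item)
    return dict(grouped)
```

Rejecting unknown labels up front keeps a speculative risk from being reported under an ad hoc status that reads like a confirmed bug.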
Step 3: Recommend next actions
-> Suggest minimal checks or analyses.
-> Do not invent robustness results.
-> If implementation is requested, make focused changes and rerun relevant commands.
Return findings first:
| severity | issue | evidence | why_it_matters | next_action | status |
|---|---|---|---|---|---|
Then include open author decisions and any test commands run. If a .aiss
model is in scope, include model identifiers and check status in the CSV sidecar.
If a Markdown issue table is shown to the user, keep a CSV sidecar with these exact snake_case columns for validation.
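A minimal sketch of writing and checking such a sidecar with Python's standard `csv` module. The column names are the issue-table columns; the helper names `write_sidecar` and `header_matches` are assumptions, and this is not the skill's own `scripts/validate_issue_table.py`, whose behavior is not described here. Model identifier and check-status columns for `.aiss`-linked reviews are left out of the sketch.

```python
import csv

COLUMNS = ["severity", "issue", "evidence", "why_it_matters", "next_action", "status"]


def write_sidecar(rows, handle) -> None:
    """Write issue rows (dicts) to an open text handle with the exact header."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        missing = [c for c in COLUMNS if c not in row]
        if missing:
            raise ValueError(f"row is missing columns: {missing}")
        # Keep only the declared columns, in the declared order.
        writer.writerow({c: row[c] for c in COLUMNS})


def header_matches(handle) -> bool:
    """Return True when the first CSV record equals COLUMNS exactly."""
    return next(csv.reader(handle), None) == COLUMNS
```

When writing to a real file, open it with `newline=""` as the `csv` documentation recommends, so that line endings are not doubled on some platforms.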
-> Run `research-analysis-runner/scripts/check_runtime_contract.py --cwd <project> ...` or inspect its JSON report before reviewing result claims.
-> Run `scripts/validate_issue_table.py <path>` to check the issue-table schema and severity labels.
-> Run `scripts/validate_ai4ss_model.py <path-to-research_model.aiss>` when a model-linked issue is in scope.
| File | Content | Read when |
|---|---|---|
| audit-checklist.md | General empirical audit checklist across data, design, inference, outputs, and claims | Running a review |
| reporting-standards.md | What tables, figures, and result text must expose | Reviewing outputs or presentation materials |
| design-routes.md | Review routes for OLS/panel, DID, IV, RD, synthetic control, descriptive, and qualitative designs | Choosing the right audit path |
| prompt-pack.md | Copy-ready prompts for script review, table audit, result-claim audit, and reproducibility checks | Turning a review need into an agent task |
| issue-examples.md | Example findings, severities, evidence standards, and false-positive controls | Calibrating review output |
npx claudepluginhub siyaozheng/ai4ss-skills --plugin ai4ss-skills