From jf-skills
Plans and audits the robustness, sensitivity, and multiple-testing battery for a Journal of Finance manuscript. Triages checks between the body and the Internet Appendix.
Slash command: /jf-skills:jf-robustness
- The main result is in; you must decide which robustness checks to run and where to put them
JF favors an accessible body within the 60-page limit. Put the 3–6 decisive checks in the main text and move the exhaustive battery to the Internet Appendix (IA), which is bundled at the end of the same PDF and does not count toward the 60-page limit (see jf-internet-appendix). This main-text/IA split is a JF hallmark; do not bury a load-bearing check in the IA.
The canonical "factor zoo" critique (Harvey, Liu & Zhu, "… and the Cross-Section of Expected Returns," Review of Financial Studies, 2016) shapes how JF referees read anomaly evidence. Reviewers therefore expect:
jf-empirical-design)

The hardest robustness decision at JF is not which checks to run but which earn a place in the lean body. Triage by how load-bearing each check is to the headline claim:
| Check | Lives in body if… | Otherwise → Internet Appendix |
|---|---|---|
| The single most threatening alternative explanation | A skeptic's first objection turns on it | never hide it in the IA |
| Multiple-testing-adjusted threshold (anomalies) | The discovery was mined from many candidates | full grid of signals → IA |
| Value-weighted / NYSE-breakpoint version | Microcap concentration is plausible | EW + alt breakpoints → IA |
| Alternative key-variable measure | The measure is contestable and pivotal | the other 4 measures → IA |
| Subsample / excluded-period | A specific event could drive the result | exhaustive subsamples → IA |
| Placebo / falsification | One clean falsification clinches credibility | the rest of the battery → IA |
The cultural signal at JF: 3–6 decisive checks plus a deep Internet Appendix reads as confident; twenty robustness tables in the body read as defensive.
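The value-weighted / NYSE-breakpoint row of the triage table can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the field names (`exchange`, `mktcap`, `signal`, `ret`), the toy data, and the use of a single median breakpoint (rather than the quintiles or deciles a paper would normally use) are all assumptions made for brevity.

```python
from statistics import median

def value_weighted_return(stocks):
    """Market-cap-weighted average return of a list of stock dicts."""
    total_cap = sum(s["mktcap"] for s in stocks)
    return sum(s["mktcap"] * s["ret"] for s in stocks) / total_cap

def long_short_nyse_median(stocks):
    """Long high-signal, short low-signal stocks, split at the NYSE median."""
    # The breakpoint is computed from NYSE-listed stocks only, so the many
    # tiny non-NYSE names cannot move the cutoff.
    cutoff = median(s["signal"] for s in stocks if s["exchange"] == "NYSE")
    high = [s for s in stocks if s["signal"] > cutoff]
    low = [s for s in stocks if s["signal"] <= cutoff]
    # Value weighting limits the influence of microcaps inside each leg.
    return value_weighted_return(high) - value_weighted_return(low)

# Hypothetical one-month cross-section: two large NYSE stocks, two microcaps.
toy = [
    {"exchange": "NYSE", "mktcap": 100.0, "signal": 1.0, "ret": 0.01},
    {"exchange": "NYSE", "mktcap": 300.0, "signal": 3.0, "ret": 0.02},
    {"exchange": "NASDAQ", "mktcap": 1.0, "signal": 5.0, "ret": 0.50},
    {"exchange": "NASDAQ", "mktcap": 1.0, "signal": 0.5, "ret": -0.20},
]
spread = long_short_nyse_median(toy)
print(f"value-weighted long-short spread: {spread:.4f}")
```

In the toy data the two microcaps carry extreme returns, so an equal-weighted spread would be dominated by them; the value-weighted version is driven by the large NYSE names, which is the point of putting this version in the body when microcap concentration is plausible.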
Illustrative numbers. An anomaly paper reports a long-short spread of 0.58%/month with a raw t = 3.2, found after screening roughly 40 candidate signals, a count the paper honestly discloses. Under the "factor zoo" lens (Harvey, Liu & Zhu), t = 3.2 is not automatically decisive:
The editor sees a robust effect, a transparent search, and a magnitude that survives the multiple-testing haircut.
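To make the illustrative numbers concrete, the following sketch computes a Bonferroni hurdle for the t-statistic when roughly 40 candidate signals were screened. The 5% family-wise level, the two-sided test, and the normal approximation to the t-distribution are assumptions chosen for illustration; the function names are hypothetical.

```python
from statistics import NormalDist

def bonferroni_t_hurdle(n_tests, alpha=0.05):
    """Two-sided critical |t| under Bonferroni (normal approximation)."""
    per_test_alpha = alpha / n_tests
    return NormalDist().inv_cdf(1 - per_test_alpha / 2)

def bonferroni_p(t_stat, n_tests):
    """Bonferroni-adjusted two-sided p-value, capped at 1."""
    raw_p = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return min(1.0, raw_p * n_tests)

# Numbers from the illustration: raw t = 3.2, about 40 signals screened.
hurdle = bonferroni_t_hurdle(n_tests=40)
adj_p = bonferroni_p(t_stat=3.2, n_tests=40)
print(f"Bonferroni |t| hurdle: {hurdle:.2f}, adjusted p-value: {adj_p:.3f}")
```

Bonferroni is the most conservative common choice, so whether a given t-statistic clears the hurdle depends on the family-wise level and on the adjustment method the paper reports. Stating the method alongside the count of signals lets the reader redo this arithmetic.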
| Pushback you will hear | JF-specific fix |
|---|---|
| "How many specifications did you try?" | State the count; report an FDR-/Bonferroni-adjusted threshold |
| "This is a microcap effect" | Value-weighted, NYSE-breakpoint version in the body |
| "You buried the failing robustness check" | Surface the load-bearing check in the body, not the appendix |
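For the "how many specifications did you try?" pushback, an FDR-adjusted answer needs the p-values of every candidate, not just the winner. A minimal sketch of the Benjamini-Hochberg step-up procedure is shown here as one possible FDR method; the p-values are hypothetical and the 5% FDR level is an assumption.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Hypothetical p-values for eight candidate signals.
candidate_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
survivors = benjamini_hochberg(candidate_p)
print(f"{sum(survivors)} of {len(candidate_p)} signals survive at 5% FDR")
```

Reporting the full vector of candidate p-values in the Internet Appendix and the adjusted threshold in the body keeps the search transparent without crowding the main text.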
【Decisive checks in body】[3–6]
【Specifications tried disclosed?】yes / no
【Multiple-testing adjustment?】yes / no — method
【Placebo/falsification present?】yes / no
【Body ≤60 pp after split?】yes / no
【Next step】jf-tables-figures
npx claudepluginhub brycewang-stanford/awesome-journal-skills --plugin jf-skills