Skill

rfs-robustness

From rfs-skills

Builds the multi-test and out-of-sample robustness battery expected by RFS referees for empirical finance manuscripts.

automation

Popularity

Parent stars

342

Parent forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/rfs-skills:rfs-robustness

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

- The main result holds in one specification but you have not stress-tested it

SKILL.md

79 lines · ~1.3k tokens

Stats

LanguageStata

Parent stars342

Parent forks45

MaintenanceGood

Last CommitJun 8, 2026

Actions

View Source View Plugin View on GitHub View README

Robustness & Multiple-Testing Discipline (rfs-robustness)

When to trigger

The main result holds in one specification but you have not stress-tested it
A new return predictor / cross-sectional anomaly is the central claim
You tested many candidate variables and want to report the ones that "worked"
Reviewers will ask "does this survive [alternative spec / subsample / period]?"
Results may be sensitive to outliers, windsorization, or a single event

The two robustness mandates at RFS

RFS punishes fragile results and, in cross-sectional asset pricing, undisciplined multiple testing. Build the battery proactively — a result that only the authors can reproduce in one specification is treated as no result.

Two RFS-specific mechanisms raise the bar above JF/JFE:

Public code release is a condition of publication. Referees and post-publication readers can and do re-run your code. A robustness claim you cannot reproduce from the released code is a liability, not a footnote — design the battery so the released scripts regenerate every check.
Registered Reports neutralize the multiple-testing critique by construction. Because RFS offers pre-results review (the format it pioneered in finance), pre-specifying the hypothesis and the test before seeing outcomes is the strongest possible answer to "you data-mined this." Even outside the Registered Report track, pre-specification is the RFS-preferred defense. The q-factor spanning logic of Hou, Xue, and Zhang (2015) "Digesting Anomalies" (RFS 28(3)) is a model for confronting the anomaly zoo head-on.

A. General-fragility battery (all empirical papers)

Alternative specifications: different FE, control sets, functional forms — show the coefficient is stable.
Alternative measures: re-estimate with an alternative proxy for the key variable.
Subsamples: split by period, size, industry, region; the sign should not flip without explanation.
Outliers: re-run with alternative winsorization/trimming; drop influential observations.
Placebo / falsification: a setting where the effect should be zero.
Alternative clustering: show inference is not driven by an SE choice.

B. Multiple-testing discipline (cross-sectional asset pricing especially)

If the claim is a new predictor or anomaly, confront the data-mining critique head-on.
Report and discuss multiple-testing-adjusted significance (e.g., Bonferroni / Holm, FDR control, or the higher t-hurdles argued in the asset-pricing replication literature such as Harvey–Liu–Zhu).
Distinguish in-sample fit from out-of-sample performance; report OOS explicitly.
Pre-specify the hypothesis; do not present a survivor of a large specification search as if it were a single test.
For factor claims, run spanning tests against established factor models and report the alpha after controls.

C. Mechanism / external validity (supporting robustness)

Show the mechanism, not just the reduced-form effect — heterogeneity consistent with the proposed channel strengthens credibility.
Triangulate with a second data source or setting when feasible.

Sequencing the battery

Run the fragility checks before writing the main tables — a result that moves under reasonable perturbation is not ready.
For asset-pricing claims, treat the out-of-sample and multiple-testing checks as primary, not optional add-ons.
Decide early which single check per result earns a place in the main paper; the rest go to the Internet Appendix (rfs-internet-appendix).
If a check weakens the result, address it in the text — do not bury it or omit it; referees will re-run the obvious ones.

Checklist

Main coefficient shown stable across ≥3 alternative specifications
Alternative measure of the key variable tested
Subsample / period splits reported; sign stability explained
Outlier/winsorization sensitivity checked
Placebo or falsification test included
For asset-pricing claims: multiple-testing adjustment + out-of-sample test reported
Spanning tests against standard factor models (if a factor claim)
Every robustness check regenerable from the code RFS will require you to release publicly
Robustness tables sized for the Internet Appendix, not the main paper

Anti-patterns

Reporting the 3 predictors that worked out of 40 tested, with no adjustment.
Calling a result "robust" after one alternative control set.
In-sample-only predictability dressed as economically meaningful.
A subsample sign flip mentioned only in a footnote with no explanation.
Dumping 30 robustness tables into the main paper instead of the IA.

Output format

【Main result】coefficient / magnitude
【Fragility checks done】[specs, measures, subsamples, outliers, placebo]
【Multiple-testing】adjustment used + OOS result (if asset pricing)
【Surviving concerns】[...]
【Where reported】main paper vs. Internet Appendix
【Next step】rfs-tables-figures

rfs-robustness

Popularity

Invocation

Context Preview

SKILL.md

rfs-robustness

Popularity

Invocation

Context Preview

SKILL.md

Robustness & Multiple-Testing Discipline (rfs-robustness)

When to trigger

The two robustness mandates at RFS

A. General-fragility battery (all empirical papers)

B. Multiple-testing discipline (cross-sectional asset pricing especially)

C. Mechanism / external validity (supporting robustness)

Sequencing the battery

Checklist

Anti-patterns

Output format

Similar Skills

Robustness & Multiple-Testing Discipline (rfs-robustness)

When to trigger

The two robustness mandates at RFS

A. General-fragility battery (all empirical papers)

B. Multiple-testing discipline (cross-sectional asset pricing especially)

C. Mechanism / external validity (supporting robustness)

Sequencing the battery

Checklist

Anti-patterns

Output format

Similar Skills