Builds a data and code replication package for an Economic Journal (EJ) manuscript to the RES/EJ Data Editor standard, including the README, data documentation, and Zenodo deposit preparation. Does not run the analysis.
The paper is heading toward acceptance and the EJ Data Editor needs a reproducible deposit
You want the package to pass the EJ Data Editor's reproducibility check on the first pass
Some data are proprietary or restricted and you must request an exemption and document access
You are setting up the project early so reproducibility is not a last-minute scramble
Before depositing, verify the current policy on the EJ Data Editor site (ejdataeditor.github.io) and
in the OUP Instructions. EJ runs pre-acceptance reproducibility checks: a paper is accepted for
final publication only after its results have been checked for reproducibility. The package is
posted to the journal's Zenodo repository, or to another trusted repository, and linked from the
paper. If you face any access restrictions, request a data exemption at first submission rather
than later in the process.
What a passing package contains
README (the centerpiece) following the DCAS / Social Science Data Editors README template:
Overview of what the code does and the mapping from code → every exhibit and in-text number.
Data availability statement: source, terms, whether each dataset is public / restricted / proprietary, and exact access steps (registrations, memberships, monetary and time costs). State clearly if data cannot be shared and why, referencing the exemption requested at first submission.
Computational requirements: software + versions, packages + versions, OS, memory, and approximate run time.
Instructions to run: a single master script that runs every step in order, end to end.
List of every table/figure/in-text number with the script and line that produces it.
Data: raw inputs (when license permits) and the code that builds analysis files from them. Provide complete documentation of all variables; if data are in a proprietary format (e.g., Stata .dta), also provide an ASCII/plain-text copy such as .csv. If raw data are restricted, include construction code plus a synthetic/simulated dataset that lets the pipeline run.
Code: a master script that reproduces every number, table, and figure from raw inputs, with relative paths and fixed seeds.
Output: log files and generated exhibits, so the editor can diff against the paper.
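The code → exhibit mapping described above can be kept as data and checked mechanically. The following is a minimal sketch, not an EJ-prescribed format: the `ExhibitSource` fields, the function names, and the example paths in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExhibitSource:
    script: str  # path relative to the package root (hypothetical layout)
    line: int    # line in the script that writes the exhibit
    output: str  # generated file the editor can compare with the paper


def unmapped_exhibits(paper_exhibits, exhibit_map):
    """Return exhibits that appear in the paper but have no entry in the map."""
    return [name for name in paper_exhibits if name not in exhibit_map]


def render_exhibit_table(exhibit_map):
    """Render the map as plain-text rows that can be pasted into the README."""
    rows = ["Exhibit | Script | Line | Output"]
    for name, src in exhibit_map.items():
        rows.append(f"{name} | {src.script} | {src.line} | {src.output}")
    return "\n".join(rows)
```

For example, with a map containing only `"Table 1": ExhibitSource("code/02_tables.py", 41, "output/table1.tex")`, calling `unmapped_exhibits(["Table 1", "Figure 2"], exhibit_map)` returns `["Figure 2"]`, flagging an exhibit that no script is recorded as producing.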
Reproducibility discipline
One master script; no manual steps, no hard-coded absolute paths, no "run cell 4 then cell 2."
Set and record random seeds for any simulation, bootstrap, or ML step.
Pin software and package versions; record them in the README and, where possible, in a lockfile/environment file.
Every exhibit and in-text number in the paper is regenerated by the code — no hand-edited tables.
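The discipline above can be sketched as one small master script. This is an illustration under stated assumptions, not the EJ Data Editor's required structure: the stages are plain Python callables supplied by the author, and `SEED`, `record_environment`, `run_pipeline`, and the `output/environment.json` file name are hypothetical.

```python
import json
import platform
import random
import sys
from importlib import metadata
from pathlib import Path

SEED = 20240101  # hypothetical value; record the seed actually used in the README


def record_environment(packages):
    """Collect interpreter, OS, and installed package versions for the README."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),
        "os": platform.platform(),
        "packages": versions,
    }


def run_pipeline(stages, root, seed=SEED, packages=()):
    """Run every (name, callable) stage in order, writing under root/output."""
    out_dir = Path(root) / "output"
    out_dir.mkdir(parents=True, exist_ok=True)
    env = record_environment(packages)
    env["seed"] = seed
    (out_dir / "environment.json").write_text(json.dumps(env, indent=2))
    for name, stage in stages:
        random.seed(seed)  # reseed so each stage reproduces when run alone
        print(f"[master] running {name}", file=sys.stderr)
        stage(out_dir)
    return out_dir
```

Passing `Path(__file__).resolve().parent` as `root` keeps every path relative to the package itself, so the script runs from a fresh directory on another machine. Stages that use other random number generators (NumPy, Stata, R) need their own seeds set and recorded in the same way.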
Restricted / proprietary data (the EJ exemption route)
Request the exemption at first submission, not at acceptance — the EJ Data Editor stresses this timing.
You may not need to deposit the data, but you must deposit the code and a precise access path so a third party with the same license can reproduce results.
Provide a Data Availability Statement and, where feasible, a small simulated dataset matching the schema so the pipeline is executable.
Confidential-data results may require a verification arrangement with the EJ Data Editor; document it.
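A simulated dataset that matches the restricted data's schema can be generated along the following lines. This is a rough sketch only: the column names, types, and distribution parameters in `SCHEMA` are invented for illustration and carry no information about any real dataset, and the draws are independent across columns, so the output exercises the pipeline without reproducing the paper's results.

```python
import csv
import random
from pathlib import Path

# Hypothetical schema: column name -> (kind, parameters).
SCHEMA = {
    "firm_id": ("int", (1, 500)),        # inclusive lower and upper bounds
    "year": ("int", (2005, 2015)),
    "log_wage": ("float", (2.5, 0.4)),   # mean, standard deviation
    "sector": ("choice", ("manuf", "services", "retail")),
}


def simulate_rows(schema, n_rows, seed):
    """Draw n_rows independent rows that match the schema's names and types."""
    rng = random.Random(seed)  # local generator; global random state is untouched
    rows = []
    for _ in range(n_rows):
        row = {}
        for column, (kind, params) in schema.items():
            if kind == "int":
                row[column] = rng.randint(*params)
            elif kind == "float":
                row[column] = round(rng.gauss(*params), 4)
            elif kind == "choice":
                row[column] = rng.choice(params)
            else:
                raise ValueError(f"unknown kind for {column}: {kind}")
        rows.append(row)
    return rows


def write_csv(rows, schema, path):
    """Write the rows as plain-text CSV with columns in schema order."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(schema))
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Because the seed is an explicit argument, the same synthetic file can be regenerated by the Data Editor, and the README can state both the seed and the fact that the file is simulated.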
Checklist
README follows the DCAS template (overview, data availability, requirements, run instructions, exhibit map)
Deposit goes to the journal's Zenodo repository (or another trusted repository) with a license allowing replication
Package layout matches EJ guidance: 1-paper, 2-appendices, README.pdf, 3-replication-package.zip, and optional 4-confidential-data-not-for-publication.zip
Single master script reproduces every table, figure, and in-text number from inputs
Software and package versions pinned and recorded
Random seeds set and documented
Relative paths only; runs on a clean machine in a fresh directory
All variables documented; proprietary-format data also provided as ASCII/plain text
Data availability statement covers each dataset (public / restricted / proprietary) with access steps and costs
Restricted data: exemption requested at first submission
Package re-run from scratch and output diffed against the paper, ready for the EJ Data Editor
Current EJ/RES data policy (DCAS, Zenodo, EJ Data Editor) verified on the official pages
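The "re-run from scratch and diff" item can be partly automated by hashing the outputs of two runs. The sketch below is one possible approach, not a tool the EJ Data Editor provides; `manifest` and `compare_runs` are hypothetical names, and the comparison is byte-for-byte, so files that embed timestamps (many logs do) will show up as changed even when the numbers agree.

```python
import hashlib
from pathlib import Path


def manifest(directory):
    """Map each file's relative path to the SHA-256 digest of its bytes."""
    directory = Path(directory)
    return {
        path.relative_to(directory).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(directory.rglob("*"))
        if path.is_file()
    }


def compare_runs(reference_dir, rerun_dir):
    """Report files that are missing, unexpected, or changed in the re-run."""
    ref, new = manifest(reference_dir), manifest(rerun_dir)
    shared = set(ref) & set(new)
    return {
        "missing": sorted(set(ref) - set(new)),
        "unexpected": sorted(set(new) - set(ref)),
        "changed": sorted(name for name in shared if ref[name] != new[name]),
    }
```

An empty result in all three lists means the clean-machine run regenerated exactly the files in the reference output folder. Matching the generated exhibits to the numbers printed in the paper still needs a separate check.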
Anti-patterns
A zip of scripts with no README and no code → exhibit mapping
Absolute paths (/Users/me/...) that break on any other machine
Unset seeds so bootstrap/simulation numbers do not reproduce
"Data available on request" with no construction code and no access detail
Requesting a restricted-data exemption only at acceptance instead of at first submission
Proprietary-only data with no ASCII/plain-text companion and no variable documentation
Hand-edited tables that the code does not actually generate
Submitting without re-running the package on a clean environment
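Some of these anti-patterns can be caught before submission with a simple scan. As a hedged example, the sketch below searches script files for quoted absolute paths; the regular expression and the list of file suffixes are illustrative assumptions and will miss some cases, such as paths assembled from variables.

```python
import re
from pathlib import Path

# Illustrative patterns only: a quote followed by /Users/, /home/, or a drive letter.
ABSOLUTE_PATH = re.compile(r"""["'](/Users/|/home/|[A-Za-z]:[\\/])""")
SCRIPT_SUFFIXES = {".py", ".R", ".do", ".m", ".jl"}  # assumed script types


def find_absolute_paths(root):
    """Return (relative file, line number, line text) for each suspicious line."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in SCRIPT_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if ABSOLUTE_PATH.search(line):
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits
```

Running `find_absolute_paths` on the package root before zipping gives a list of lines to rewrite as paths relative to the master script.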
Output format
【Policy verified】EJ/RES data policy (DCAS, Zenodo, EJ Data Editor) checked on official pages [y/n]
【README】DCAS template sections present? [y/n each]
【Deposit】Zenodo (or trusted repo) + replication license attached? [y/n]
【Master script】reproduces all exhibits + in-text numbers from raw? [y/n]
【Versions + seeds】pinned/documented? [y/n]
【Data status】public / restricted (exemption at first submission) + access path; ASCII companion? [y/n]
【Clean-machine test】passed, ready for EJ Data Editor? [y/n]
【Next】ecj-submission