JAR Data and Code Sharing Policy
This template is designed to satisfy the Journal of Accounting Research’s Data and Code Sharing Policy (https://onlinelibrary.wiley.com/page/journal/1475679x/homepage/forauthors.html#DataPolicy). The policy expects authors to provide three things:
- Code that converts raw data into the final analytical dataset and produces the reported tables and figures.
- A comprehensive log file documenting the end-to-end execution of that code.
- Identifiers (e.g., `gvkey`, `permno`) of the observations comprising the final sample.
project-template is designed around these requirements:
- The pipeline splits raw WRDS pulls (`RAW_DATA_DIR`) from derived data (`DATA_DIR`). A replication run can re-execute scripts 2-4 against the original researcher’s preserved raw inputs without hitting WRDS.
- Every pipeline step produces a per-script log in the SAS-log style: every command echoed, output interleaved, plain text. R steps go through `batch_run()` (an `R CMD BATCH` wrapper in `utils.R`); Python steps go through an equivalent `batch_run()` in `utils.py` that runs the script in a subprocess through an AST-based echo wrapper; Stata’s native `log using` and SAS’s native `proc printto` produce the same shape. All four supported languages emit visually consistent logs.
- The `005-data-provenance.{R,py}` step exports `sample-identifiers.{parquet,csv}` (`gvkey`, `permno`, `rdq`, `datadate`, `fyearq`, `fqtr`) and prints SHA256 hashes for every raw, derived, and output file. That step’s own `.Rout`/`.log` is the project’s content-addressed provenance record.
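The hashing part of the provenance step can be illustrated with a minimal sketch. This is not the template's actual `005-data-provenance.py`; the function names `sha256_of_file` and `hash_tree` and the directory names `raw-data`, `data`, and `output` are hypothetical, chosen only to show how printing one SHA256 digest per file yields a content-addressed record once the output is captured in the step's log.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Chunked reads keep memory use flat for large data files.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file under root (relative POSIX path) to its SHA256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))  # sorted for a stable listing
        if p.is_file()
    }


if __name__ == "__main__":
    # Hypothetical directory names (assumption): the template's real
    # locations come from RAW_DATA_DIR and DATA_DIR, not shown here.
    for label in ("raw-data", "data", "output"):
        root = Path(label)
        if not root.is_dir():
            continue
        for rel, file_digest in hash_tree(root).items():
            print(f"{file_digest}  {label}/{rel}")
```

Because the listing is sorted and depends only on file contents, two runs over identical inputs print identical lines, so a replicator can diff their log against the original to confirm that raw, derived, and output files match.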