TY - GEN
T1 - Enabling the verification of computational results: An empirical evaluation of computational reproducibility
AU - Stodden, Victoria
AU - Krafczyk, Matthew S.
AU - Bhaskar, Adhithya
N1 - Funding Information:
The authors would like to thank David Wong, Yantong Zhang, and Alex Dickinson for outstanding research assistance. We also acknowledge support from NSF Award ACI-1659702 and the NCSA SPIN program.
Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/6/11
Y1 - 2018/6/11
N2 - The ability to independently regenerate published computational claims is widely recognized as a key component of scientific reproducibility. In this article we take a narrow interpretation of this goal and attempt to regenerate published claims from author-supplied information, including data, code, inputs, and other provided specifications, on a different computational system than that used by the original authors. We are motivated by Claerbout and Donoho’s exhortation regarding the importance of providing complete information for the reproducibility of published claims. We chose the Elsevier journal, the Journal of Computational Physics, whose stated author guidelines encourage the availability of the computational digital artifacts that support scholarly findings. In an IRB-approved study at the University of Illinois at Urbana-Champaign (IRB #17329), we gathered artifacts from a sample of authors who published in this journal in 2016 and 2017. We then used the criteria generated at the 2012 ICERM workshop “Reproducibility in Computational and Experimental Mathematics” to evaluate the sufficiency of the information provided in the publications and the ease with which the digital artifacts afforded computational reproducibility. We find that, for the articles for which we obtained computational artifacts, we could not easily regenerate the findings for 67% of them, and we could not easily regenerate all of the findings for any article. We then evaluate the artifacts we did obtain (from 55 of 306 articles) and find that the main barriers to computational reproducibility are inadequate documentation of code, data, and workflow information (70.9%); missing information on code functions and settings; and missing licensing information (75%). We recommend improvements based on these findings, including the deposit of supporting digital artifacts for reproducibility as a condition of publication, and verification of computational findings via re-execution of the code when possible.
AB - The ability to independently regenerate published computational claims is widely recognized as a key component of scientific reproducibility. In this article we take a narrow interpretation of this goal and attempt to regenerate published claims from author-supplied information, including data, code, inputs, and other provided specifications, on a different computational system than that used by the original authors. We are motivated by Claerbout and Donoho’s exhortation regarding the importance of providing complete information for the reproducibility of published claims. We chose the Elsevier journal, the Journal of Computational Physics, whose stated author guidelines encourage the availability of the computational digital artifacts that support scholarly findings. In an IRB-approved study at the University of Illinois at Urbana-Champaign (IRB #17329), we gathered artifacts from a sample of authors who published in this journal in 2016 and 2017. We then used the criteria generated at the 2012 ICERM workshop “Reproducibility in Computational and Experimental Mathematics” to evaluate the sufficiency of the information provided in the publications and the ease with which the digital artifacts afforded computational reproducibility. We find that, for the articles for which we obtained computational artifacts, we could not easily regenerate the findings for 67% of them, and we could not easily regenerate all of the findings for any article. We then evaluate the artifacts we did obtain (from 55 of 306 articles) and find that the main barriers to computational reproducibility are inadequate documentation of code, data, and workflow information (70.9%); missing information on code functions and settings; and missing licensing information (75%). We recommend improvements based on these findings, including the deposit of supporting digital artifacts for reproducibility as a condition of publication, and verification of computational findings via re-execution of the code when possible.
KW - Code access
KW - Data access
KW - Provenance
KW - Reproducibility policy
KW - Reproducible research
KW - Workflows
UR - http://www.scopus.com/inward/record.url?scp=85050081067&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050081067&partnerID=8YFLogxK
U2 - 10.1145/3214239.3214242
DO - 10.1145/3214239.3214242
M3 - Conference contribution
AN - SCOPUS:85050081067
T3 - Proceedings of the 1st International Workshop on Practical Reproducible Evaluation of Computer Systems, P-RECS 2018
BT - Proceedings of the 1st International Workshop on Practical Reproducible Evaluation of Computer Systems, P-RECS 2018
PB - Association for Computing Machinery
T2 - 1st International Workshop on Practical Reproducible Evaluation of Computer Systems, P-RECS 2018
Y2 - 11 June 2018
ER -