TY - GEN
T1 - Scientific tests and continuous integration strategies to enhance reproducibility in the scientific software context
AU - Krafczyk, Matthew
AU - Shi, August
AU - Bhaskar, Adhithya
AU - Marinov, Darko
AU - Stodden, Victoria
N1 - Funding Information:
This research was supported by NSF Awards CCF-1421503, CCF-1763788, and OAC-1839010. We thank NCSA and its SPIN program for their support. We also thank the anonymous reviewers who helped us improve this manuscript.
Publisher Copyright:
© 2019 Association for Computing Machinery.
PY - 2019/6/17
Y1 - 2019/6/17
AB - Continuous integration (CI) is a well-established technique in commercial and open-source software projects, although it is not routinely used in scientific publishing. In the scientific software context, CI can serve two functions to increase the reproducibility of scientific results: providing an established platform for testing the reproducibility of these results, and demonstrating to other scientists how the code and data generate the published results. We explore scientific software testing and CI strategies using two articles published in the areas of applied mathematics and computational physics. We discuss lessons learned from reproducing these articles and examine their existing tests. We introduce the notion of a scientific test as one that produces computational results from a published article. We then consider full result reproduction within a CI environment. If authors find their work too time- or resource-intensive to easily adapt to a CI context, we recommend the inclusion of results from reduced versions of their work (e.g., run at lower resolution, with shorter time scales, or with smaller data sets) alongside their primary results within their article. While these smaller versions may be less interesting scientifically, they can serve to verify that published code and data are working properly. We demonstrate such reduction tests on the two articles studied.
KW - Continuous integration
KW - Reproducibility
KW - Scientific software
KW - Software reliability
KW - Software testing
UR - http://www.scopus.com/inward/record.url?scp=85069145767&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85069145767&partnerID=8YFLogxK
DO - 10.1145/3322790.3330595
M3 - Conference contribution
AN - SCOPUS:85069145767
T3 - P-RECS 2019 - Proceedings of the 2nd International Workshop on Practical Reproducible Evaluation of Computer Systems, co-located with HPDC 2019
SP - 23
EP - 28
BT - P-RECS 2019 - Proceedings of the 2nd International Workshop on Practical Reproducible Evaluation of Computer Systems, co-located with HPDC 2019
PB - Association for Computing Machinery
T2 - 2nd International Workshop on Practical Reproducible Evaluation of Computer Systems, P-RECS 2019, co-located with HPDC 2019
Y2 - 24 June 2019
ER -