TY - GEN
T1 - Testing probabilistic programming systems
AU - Dutta, Saikat
AU - Legunsen, Owolabi
AU - Huang, Zixin
AU - Misailovic, Sasa
N1 - Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/10/26
Y1 - 2018/10/26
N2 - Probabilistic programming systems (PP systems) allow developers to model stochastic phenomena and perform efficient inference on the models. The number and adoption of probabilistic programming systems is growing significantly. However, there is no prior study of bugs in these systems and no methodology for systematically testing PP systems. Yet, testing PP systems is highly non-trivial, especially when they perform approximate inference. In this paper, we characterize 118 previously reported bugs in three open-source PP systems-Edward, Pyro and Stan-and propose ProbFuzz, an extensible system for testing PP systems. Prob- Fuzz allows a developer to specify templates of probabilistic models, from which it generates concrete probabilistic programs and data for testing. ProbFuzz uses language-specific translators to generate these concrete programs, which use the APIs of each PP system. ProbFuzz finds potential bugs by checking the output from running the generated programs against several oracles, including an accuracy checker. Using ProbFuzz, we found 67 previously unknown bugs in recent versions of these PP systems. Developers already accepted 51 bug fixes that we submitted to the three PP systems, and their underlying systems, PyTorch and TensorFlow.
AB - Probabilistic programming systems (PP systems) allow developers to model stochastic phenomena and perform efficient inference on the models. The number and adoption of probabilistic programming systems is growing significantly. However, there is no prior study of bugs in these systems and no methodology for systematically testing PP systems. Yet, testing PP systems is highly non-trivial, especially when they perform approximate inference. In this paper, we characterize 118 previously reported bugs in three open-source PP systems-Edward, Pyro and Stan-and propose ProbFuzz, an extensible system for testing PP systems. Prob- Fuzz allows a developer to specify templates of probabilistic models, from which it generates concrete probabilistic programs and data for testing. ProbFuzz uses language-specific translators to generate these concrete programs, which use the APIs of each PP system. ProbFuzz finds potential bugs by checking the output from running the generated programs against several oracles, including an accuracy checker. Using ProbFuzz, we found 67 previously unknown bugs in recent versions of these PP systems. Developers already accepted 51 bug fixes that we submitted to the three PP systems, and their underlying systems, PyTorch and TensorFlow.
KW - Probabilistic programming languages
KW - Software Testing
UR - http://www.scopus.com/inward/record.url?scp=85058291625&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85058291625&partnerID=8YFLogxK
U2 - 10.1145/3236024.3236057
DO - 10.1145/3236024.3236057
M3 - Conference contribution
AN - SCOPUS:85058291625
T3 - ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
SP - 574
EP - 586
BT - ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European So ftware Engineering Conference and Symposium on the Foundations of So ftware Engineering
A2 - Garci, Alessandro
A2 - Pasareanu, Corina S.
A2 - Leavens, Gary T.
PB - Association for Computing Machinery
T2 - 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2018
Y2 - 4 November 2018 through 9 November 2018
ER -