Testing probabilistic programming systems

Saikat Dutta, Owolabi Legunsen, Zixin Huang, Sasa Misailovic

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Probabilistic programming systems (PP systems) allow developers to model stochastic phenomena and perform efficient inference on the models. The number and adoption of probabilistic programming systems is growing significantly. However, there is no prior study of bugs in these systems and no methodology for systematically testing PP systems. Yet, testing PP systems is highly non-trivial, especially when they perform approximate inference. In this paper, we characterize 118 previously reported bugs in three open-source PP systems (Edward, Pyro, and Stan) and propose ProbFuzz, an extensible system for testing PP systems. ProbFuzz allows a developer to specify templates of probabilistic models, from which it generates concrete probabilistic programs and data for testing. ProbFuzz uses language-specific translators to generate these concrete programs, which use the APIs of each PP system. ProbFuzz finds potential bugs by checking the output from running the generated programs against several oracles, including an accuracy checker. Using ProbFuzz, we found 67 previously unknown bugs in recent versions of these PP systems. Developers already accepted 51 bug fixes that we submitted to the three PP systems, and their underlying systems, PyTorch and TensorFlow.
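
The workflow the abstract describes (fill a model template, generate data, run the system, check the output against oracles) can be illustrated with a small, self-contained sketch. This is not ProbFuzz's implementation: the template syntax, every function name, the tolerance, and the grid-based "system under test" are assumptions made for illustration, and a conjugate normal model is used only because its exact posterior mean is available as a reference for the accuracy check.

```python
import math
import random

# Hypothetical template: braces mark "holes" that the generator fills in.
TEMPLATE = "mu ~ Normal({prior_mean}, {prior_sd}); y[i] ~ Normal(mu, {noise_sd})"


def fill_template(rng):
    """Choose concrete values for the holes; return (program text, parameters)."""
    params = {
        "prior_mean": round(rng.uniform(-5.0, 5.0), 2),
        "prior_sd": round(rng.uniform(0.5, 5.0), 2),
        "noise_sd": round(rng.uniform(0.5, 2.0), 2),
    }
    return TEMPLATE.format(**params), params


def generate_data(rng, params, n=50):
    """Simulate observations from the concrete model."""
    true_mu = rng.gauss(params["prior_mean"], params["prior_sd"])
    return [rng.gauss(true_mu, params["noise_sd"]) for _ in range(n)]


def exact_posterior_mean(params, data):
    """Closed-form posterior mean of mu (normal prior, normal likelihood, known sd)."""
    prior_precision = 1.0 / params["prior_sd"] ** 2
    data_precision = len(data) / params["noise_sd"] ** 2
    sample_mean = sum(data) / len(data)
    weighted = prior_precision * params["prior_mean"] + data_precision * sample_mean
    return weighted / (prior_precision + data_precision)


def grid_posterior_mean(params, data, points=4001):
    """Stand-in system under test: integrate the posterior numerically on a grid."""
    low = params["prior_mean"] - 8.0 * params["prior_sd"]
    high = params["prior_mean"] + 8.0 * params["prior_sd"]
    grid = [low + (high - low) * i / (points - 1) for i in range(points)]

    def log_posterior(mu):
        log_prior = -0.5 * ((mu - params["prior_mean"]) / params["prior_sd"]) ** 2
        log_lik = sum(-0.5 * ((y - mu) / params["noise_sd"]) ** 2 for y in data)
        return log_prior + log_lik

    logs = [log_posterior(mu) for mu in grid]
    top = max(logs)  # subtract the maximum to avoid underflow in exp
    weights = [math.exp(value - top) for value in logs]
    return sum(w * mu for w, mu in zip(weights, grid)) / sum(weights)


def check_oracles(estimate, reference, tol=0.05):
    """Return the names of the oracles that the estimate violates."""
    if math.isnan(estimate) or math.isinf(estimate):
        return ["non-finite output"]
    if abs(estimate - reference) > tol:
        return ["accuracy"]
    return []


def run_campaign(system_under_test, trials=5, seed=0):
    """Generate programs and data, run the system, and collect oracle failures."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        program, params = fill_template(rng)
        data = generate_data(rng, params)
        estimate = system_under_test(params, data)
        problems = check_oracles(estimate, exact_posterior_mean(params, data))
        if problems:
            failures.append((program, problems))
    return failures


if __name__ == "__main__":
    print(run_campaign(grid_posterior_mean))                # a well-behaved system
    print(run_campaign(lambda params, data: float("nan")))  # a deliberately broken one
```

In a real setting the stand-in would be replaced by a translator that emits a program for a specific PP system and runs it. An exact reference is rarely available for approximate inference, which is why the abstract calls testing these systems highly non-trivial.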

Original language: English (US)
Title of host publication: ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Editors: Alessandro Garci, Corina S. Pasareanu, Gary T. Leavens
Publisher: Association for Computing Machinery, Inc
Pages: 574-586
Number of pages: 13
ISBN (Electronic): 9781450355735
DOI: https://doi.org/10.1145/3236024.3236057
State: Published - Oct 26 2018
Externally published: Yes
Event: 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2018 - Lake Buena Vista, United States
Duration: Nov 4 2018 to Nov 9 2018

Publication series

Name: ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Other

Other: 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2018
Country: United States
City: Lake Buena Vista
Period: 11/4/18 to 11/9/18

Fingerprint

Computer systems programming
Testing
Stochastic models
Application programming interfaces (API)

Keywords

  • Probabilistic programming languages
  • Software Testing

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Cite this

Dutta, S., Legunsen, O., Huang, Z., & Misailovic, S. (2018). Testing probabilistic programming systems. In A. Garci, C. S. Pasareanu, & G. T. Leavens (Eds.), ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (pp. 574-586). (ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering). Association for Computing Machinery, Inc. https://doi.org/10.1145/3236024.3236057

@inproceedings{ec5c297953774f30a85a1e7880eb960e,
title = "Testing probabilistic programming systems",
abstract = "Probabilistic programming systems (PP systems) allow developers to model stochastic phenomena and perform efficient inference on the models. The number and adoption of probabilistic programming systems is growing significantly. However, there is no prior study of bugs in these systems and no methodology for systematically testing PP systems. Yet, testing PP systems is highly non-trivial, especially when they perform approximate inference. In this paper, we characterize 118 previously reported bugs in three open-source PP systems (Edward, Pyro, and Stan) and propose ProbFuzz, an extensible system for testing PP systems. ProbFuzz allows a developer to specify templates of probabilistic models, from which it generates concrete probabilistic programs and data for testing. ProbFuzz uses language-specific translators to generate these concrete programs, which use the APIs of each PP system. ProbFuzz finds potential bugs by checking the output from running the generated programs against several oracles, including an accuracy checker. Using ProbFuzz, we found 67 previously unknown bugs in recent versions of these PP systems. Developers already accepted 51 bug fixes that we submitted to the three PP systems, and their underlying systems, PyTorch and TensorFlow.",
keywords = "Probabilistic programming languages, Software Testing",
author = "Saikat Dutta and Owolabi Legunsen and Zixin Huang and Sasa Misailovic",
year = "2018",
month = "10",
day = "26",
doi = "10.1145/3236024.3236057",
language = "English (US)",
series = "ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering",
publisher = "Association for Computing Machinery, Inc",
pages = "574--586",
editor = "Alessandro Garci and Pasareanu, {Corina S.} and Leavens, {Gary T.}",
booktitle = "ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering",

}

TY - GEN

T1 - Testing probabilistic programming systems

AU - Dutta, Saikat

AU - Legunsen, Owolabi

AU - Huang, Zixin

AU - Misailovic, Sasa

PY - 2018/10/26

Y1 - 2018/10/26

AB - Probabilistic programming systems (PP systems) allow developers to model stochastic phenomena and perform efficient inference on the models. The number and adoption of probabilistic programming systems is growing significantly. However, there is no prior study of bugs in these systems and no methodology for systematically testing PP systems. Yet, testing PP systems is highly non-trivial, especially when they perform approximate inference. In this paper, we characterize 118 previously reported bugs in three open-source PP systems (Edward, Pyro, and Stan) and propose ProbFuzz, an extensible system for testing PP systems. ProbFuzz allows a developer to specify templates of probabilistic models, from which it generates concrete probabilistic programs and data for testing. ProbFuzz uses language-specific translators to generate these concrete programs, which use the APIs of each PP system. ProbFuzz finds potential bugs by checking the output from running the generated programs against several oracles, including an accuracy checker. Using ProbFuzz, we found 67 previously unknown bugs in recent versions of these PP systems. Developers already accepted 51 bug fixes that we submitted to the three PP systems, and their underlying systems, PyTorch and TensorFlow.

KW - Probabilistic programming languages

KW - Software Testing

UR - http://www.scopus.com/inward/record.url?scp=85058291625&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85058291625&partnerID=8YFLogxK

U2 - 10.1145/3236024.3236057

DO - 10.1145/3236024.3236057

M3 - Conference contribution

AN - SCOPUS:85058291625

T3 - ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

SP - 574

EP - 586

BT - ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

A2 - Garci, Alessandro

A2 - Pasareanu, Corina S.

A2 - Leavens, Gary T.

PB - Association for Computing Machinery, Inc

ER -