Mitigating the effects of flaky tests on mutation testing

August Shi, Jonathan Bell, Darko Marinov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Mutation testing is widely used in research as a metric for evaluating the quality of test suites. Mutation testing runs the test suite on generated mutants (variants of the code under test) where a test suite kills a mutant if any of the tests fail when run on the mutant. Mutation testing implicitly assumes that tests exhibit deterministic behavior, in terms of their coverage and the outcome of a test (not) killing a certain mutant. Such an assumption does not hold in the presence of flaky tests, whose outcomes can non-deterministically differ even when run on the same code under test. Without reliable test outcomes, mutation testing can result in unreliable results, e.g., in our experiments, mutation scores vary by four percentage points on average between repeated executions, and 9% of mutant-test pairs have an unknown status. Many modern software projects suffer from flaky tests. We propose techniques that manage flakiness throughout the mutation testing process, largely based on strategically re-running tests. We implement our techniques by modifying the open-source mutation testing tool, PIT. Our evaluation on 30 projects shows that our techniques reduce the number of “unknown” (flaky) mutants by 79.4%.
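
The idea of re-running tests to separate reliable outcomes from flaky ones can be illustrated with a minimal sketch. This is not the authors' implementation and does not use PIT's API. Every name below (`classify_pair`, `classify_mutant`, `mutation_score`, the toy tests and mutant ids) is hypothetical, and the rerun count and the choice to count "unknown" mutants as not killed are assumptions made only for illustration.

```python
import itertools
from typing import Callable, Dict, List

# Hypothetical model: a test is a callable that takes a mutant id and
# returns True if the test passes on that mutant, False if it fails.
Test = Callable[[str], bool]


def classify_pair(test: Test, mutant: str, reruns: int = 5) -> str:
    """Run one test on one mutant several times and classify the pair."""
    outcomes = {test(mutant) for _ in range(reruns)}
    if outcomes == {False}:
        return "killed"    # the test failed on every run
    if outcomes == {True}:
        return "survived"  # the test passed on every run
    return "unknown"       # outcomes disagreed across runs (flaky)


def classify_mutant(tests: List[Test], mutant: str, reruns: int = 5) -> str:
    """A mutant is killed if any test reliably fails on it."""
    statuses = [classify_pair(t, mutant, reruns) for t in tests]
    if "killed" in statuses:
        return "killed"
    if "unknown" in statuses:
        return "unknown"
    return "survived"


def mutation_score(statuses: Dict[str, str]) -> float:
    """Fraction of mutants killed (assumption: unknown counts as not killed)."""
    killed = sum(1 for s in statuses.values() if s == "killed")
    return killed / len(statuses)


# Toy tests: one deterministic, one that alternates pass/fail on mutant "m2".
def stable_test(mutant: str) -> bool:
    return mutant != "m1"  # fails only on m1


_flips = itertools.cycle([True, False])


def flaky_test(mutant: str) -> bool:
    return next(_flips) if mutant == "m2" else True


tests = [stable_test, flaky_test]
statuses = {m: classify_mutant(tests, m) for m in ["m1", "m2", "m3"]}
score = mutation_score(statuses)
print(statuses, score)
```

In this toy run, the alternating test makes one mutant's status unknown, which is the kind of outcome the abstract reports for 9% of mutant-test pairs in the authors' experiments.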

Original language: English (US)
Title of host publication: ISSTA 2019 - Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis
Editors: Dongmei Zhang, Anders Moller
Publisher: Association for Computing Machinery, Inc
Pages: 296-306
Number of pages: 11
ISBN (Electronic): 9781450362245
DOI: https://doi.org/10.1145/3293882.3330568
State: Published - Jul 10 2019
Event: 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2019 - Beijing, China
Duration: Jul 15 2019 to Jul 19 2019

Publication series

Name: ISSTA 2019 - Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis

Conference

Conference: 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2019
Country: China
City: Beijing
Period: 7/15/19 to 7/19/19

Keywords

  • Flaky tests
  • Mutation testing
  • Non-deterministic coverage

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Software

Cite this

Shi, A., Bell, J., & Marinov, D. (2019). Mitigating the effects of flaky tests on mutation testing. In D. Zhang, & A. Moller (Eds.), ISSTA 2019 - Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (pp. 296-306). (ISSTA 2019 - Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis). Association for Computing Machinery, Inc. https://doi.org/10.1145/3293882.3330568

@inproceedings{e23be9c626e646f588d3c018f4134333,
title = "Mitigating the effects of flaky tests on mutation testing",
abstract = "Mutation testing is widely used in research as a metric for evaluating the quality of test suites. Mutation testing runs the test suite on generated mutants (variants of the code under test) where a test suite kills a mutant if any of the tests fail when run on the mutant. Mutation testing implicitly assumes that tests exhibit deterministic behavior, in terms of their coverage and the outcome of a test (not) killing a certain mutant. Such an assumption does not hold in the presence of flaky tests, whose outcomes can non-deterministically differ even when run on the same code under test. Without reliable test outcomes, mutation testing can result in unreliable results, e.g., in our experiments, mutation scores vary by four percentage points on average between repeated executions, and 9{\%} of mutant-test pairs have an unknown status. Many modern software projects suffer from flaky tests. We propose techniques that manage flakiness throughout the mutation testing process, largely based on strategically re-running tests. We implement our techniques by modifying the open-source mutation testing tool, PIT. Our evaluation on 30 projects shows that our techniques reduce the number of ``unknown'' (flaky) mutants by 79.4{\%}.",
keywords = "Flaky tests, Mutation testing, Non-deterministic coverage",
author = "August Shi and Jonathan Bell and Darko Marinov",
year = "2019",
month = "7",
day = "10",
doi = "10.1145/3293882.3330568",
language = "English (US)",
series = "ISSTA 2019 - Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis",
publisher = "Association for Computing Machinery, Inc",
pages = "296--306",
editor = "Dongmei Zhang and Anders Moller",
booktitle = "ISSTA 2019 - Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis",

}

TY - GEN

T1 - Mitigating the effects of flaky tests on mutation testing

AU - Shi, August

AU - Bell, Jonathan

AU - Marinov, Darko

PY - 2019/7/10

Y1 - 2019/7/10

N2 - Mutation testing is widely used in research as a metric for evaluating the quality of test suites. Mutation testing runs the test suite on generated mutants (variants of the code under test) where a test suite kills a mutant if any of the tests fail when run on the mutant. Mutation testing implicitly assumes that tests exhibit deterministic behavior, in terms of their coverage and the outcome of a test (not) killing a certain mutant. Such an assumption does not hold in the presence of flaky tests, whose outcomes can non-deterministically differ even when run on the same code under test. Without reliable test outcomes, mutation testing can result in unreliable results, e.g., in our experiments, mutation scores vary by four percentage points on average between repeated executions, and 9% of mutant-test pairs have an unknown status. Many modern software projects suffer from flaky tests. We propose techniques that manage flakiness throughout the mutation testing process, largely based on strategically re-running tests. We implement our techniques by modifying the open-source mutation testing tool, PIT. Our evaluation on 30 projects shows that our techniques reduce the number of "unknown" (flaky) mutants by 79.4%.

AB - Mutation testing is widely used in research as a metric for evaluating the quality of test suites. Mutation testing runs the test suite on generated mutants (variants of the code under test) where a test suite kills a mutant if any of the tests fail when run on the mutant. Mutation testing implicitly assumes that tests exhibit deterministic behavior, in terms of their coverage and the outcome of a test (not) killing a certain mutant. Such an assumption does not hold in the presence of flaky tests, whose outcomes can non-deterministically differ even when run on the same code under test. Without reliable test outcomes, mutation testing can result in unreliable results, e.g., in our experiments, mutation scores vary by four percentage points on average between repeated executions, and 9% of mutant-test pairs have an unknown status. Many modern software projects suffer from flaky tests. We propose techniques that manage flakiness throughout the mutation testing process, largely based on strategically re-running tests. We implement our techniques by modifying the open-source mutation testing tool, PIT. Our evaluation on 30 projects shows that our techniques reduce the number of "unknown" (flaky) mutants by 79.4%.

KW - Flaky tests

KW - Mutation testing

KW - Non-deterministic coverage

UR - http://www.scopus.com/inward/record.url?scp=85070631430&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070631430&partnerID=8YFLogxK

U2 - 10.1145/3293882.3330568

DO - 10.1145/3293882.3330568

M3 - Conference contribution

AN - SCOPUS:85070631430

T3 - ISSTA 2019 - Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis

SP - 296

EP - 306

BT - ISSTA 2019 - Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis

A2 - Zhang, Dongmei

A2 - Moller, Anders

PB - Association for Computing Machinery, Inc

ER -