TY - GEN
T1 - Preempting Flaky Tests via Non-Idempotent-Outcome Tests
AU - Wei, Anjiang
AU - Yi, Pu
AU - Li, Zhengxi
AU - Xie, Tao
AU - Marinov, Darko
AU - Lam, Wing
N1 - Funding Information:
We thank Yang Chen, Ruixin Wang, Satvik Eltepu, Reed Oei, and Jonathan Stein for their help. This work was partially supported by US NSF grants CCF-1763788 and CCF-1956374, by the Natural Science Foundation of China (Grant No. 62161146003), and by the XPLORER PRIZE. Tao Xie (the corresponding author) is also with the Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, China. We acknowledge support for research on flaky tests from Facebook and Google.
Publisher Copyright:
© 2022 ACM.
PY - 2022
Y1 - 2022
AB - Regression testing can greatly help in software development, but it can be seriously undermined by flaky tests, which can both pass and fail, seemingly nondeterministically, on the same code commit. Flaky tests are an emerging topic in both research and industry. Prior work has identified multiple categories of flaky tests, developed techniques for detecting these flaky tests, and analyzed some detected flaky tests. To proactively detect, i.e., preempt, flaky tests, we propose to detect non-idempotent-outcome (NIO) tests, a novel category related to flaky tests. In particular, we run each test twice in the same test execution environment, e.g., run each Java test twice in the same Java Virtual Machine. A test is NIO if it passes in the first run but fails in the second. Each NIO test has side effects and 'self-pollutes' the state shared among test runs. We perform experiments on both Java and Python open-source projects, detecting 223 NIO Java tests and 138 NIO Python tests. We have inspected all 361 detected tests and opened pull requests that fix 268 tests, with 192 already accepted, only 6 rejected, and the remaining 70 pending.
UR - http://www.scopus.com/inward/record.url?scp=85133537509&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133537509&partnerID=8YFLogxK
U2 - 10.1145/3510003.3510170
DO - 10.1145/3510003.3510170
M3 - Conference contribution
AN - SCOPUS:85133537509
T3 - Proceedings - International Conference on Software Engineering
SP - 1730
EP - 1742
BT - Proceedings - 2022 ACM/IEEE 44th International Conference on Software Engineering, ICSE 2022
PB - IEEE Computer Society
T2 - 44th ACM/IEEE International Conference on Software Engineering, ICSE 2022
Y2 - 22 May 2022 through 27 May 2022
ER -