TY - GEN
T1 - Evaluating test-suite reduction in real software evolution
AU - Shi, August
AU - Gyori, Alex
AU - Mahmood, Suleman
AU - Zhao, Peiyuan
AU - Marinov, Darko
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/7/12
Y1 - 2018/7/12
N2 - Test-suite reduction (TSR) speeds up regression testing by removing redundant tests from the test suite, thus running fewer tests in the future builds. To decide whether to use TSR or not, a developer needs some way to predict how well the reduced test suite will detect real faults in the future compared to the original test suite. Prior research evaluated the cost of TSR using only program versions with seeded faults, but such evaluations do not explicitly predict the effectiveness of the reduced test suite in future builds. We perform the first extensive study of TSR using real test failures in (failed) builds that occurred for real code changes. We analyze 1478 failed builds from 32 GitHub projects that run their tests on Travis. Each failed build can have multiple faults, so we propose a family of mappings from test failures to faults. We use these mappings to compute Failed-Build Detection Loss (FBDL), the percentage of failed builds where the reduced test suite fails to detect all the faults detected by the original test suite. We find that FBDL can be up to 52.2%, which is higher than suggested by traditional TSR metrics. Moreover, traditional TSR metrics are not good predictors of FBDL, making it difficult for developers to decide whether to use reduced test suites.
AB - Test-suite reduction (TSR) speeds up regression testing by removing redundant tests from the test suite, thus running fewer tests in the future builds. To decide whether to use TSR or not, a developer needs some way to predict how well the reduced test suite will detect real faults in the future compared to the original test suite. Prior research evaluated the cost of TSR using only program versions with seeded faults, but such evaluations do not explicitly predict the effectiveness of the reduced test suite in future builds. We perform the first extensive study of TSR using real test failures in (failed) builds that occurred for real code changes. We analyze 1478 failed builds from 32 GitHub projects that run their tests on Travis. Each failed build can have multiple faults, so we propose a family of mappings from test failures to faults. We use these mappings to compute Failed-Build Detection Loss (FBDL), the percentage of failed builds where the reduced test suite fails to detect all the faults detected by the original test suite. We find that FBDL can be up to 52.2%, which is higher than suggested by traditional TSR metrics. Moreover, traditional TSR metrics are not good predictors of FBDL, making it difficult for developers to decide whether to use reduced test suites.
KW - Continuous integration
KW - Regression testing
KW - Test-suite reduction
UR - http://www.scopus.com/inward/record.url?scp=85051501888&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051501888&partnerID=8YFLogxK
U2 - 10.1145/3213846.3213875
DO - 10.1145/3213846.3213875
M3 - Conference contribution
AN - SCOPUS:85051501888
T3 - ISSTA 2018 - Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis
SP - 84
EP - 94
BT - ISSTA 2018 - Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis
A2 - Bodden, Eric
A2 - Tip, Frank
PB - Association for Computing Machinery
T2 - 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2018
Y2 - 16 July 2018 through 21 July 2018
ER -