TY - GEN
T1 - Quantizing Large-Language Models for Predicting Flaky Tests
AU - Rahman, Shanto
AU - Baz, Abdelrahman
AU - Misailovic, Sasa
AU - Shi, August
N1 - We acknowledge NSF grant nos. CCF-2145774 and CCF-2313028, and the Jarmon Innovation Fund.
PY - 2024
Y1 - 2024
N2 - A major challenge in regression testing practice is the presence of flaky tests, which non-deterministically pass or fail when run on the same code. Previous research identified multiple categories of flaky tests. Prior research has also developed techniques for automatically detecting which tests are flaky or categorizing flaky tests, but these techniques generally involve repeatedly rerunning tests in various ways, making them costly to use. Although several recent approaches have utilized large-language models (LLMs) to predict which tests are flaky or predict flaky-test categories without needing to rerun tests, they are costly to use due to relying on a large neural network to perform feature extraction and prediction. We propose FlakyQ to improve the effectiveness of LLM-based flaky-test prediction by quantizing the LLM's weights. The quantized LLM can extract features from test code more efficiently. To make up for the loss in prediction performance due to quantization, we further train a traditional ML classifier (e.g., a random forest) to learn from the features extracted by the quantized LLM and make the same prediction. The final model has similar prediction performance while running faster than the non-quantized LLM. Our evaluation finds that FlakyQ classifiers consistently improve prediction time over the non-quantized LLM classifier, saving 25.4% in prediction time over all tests, along with a 48.4% reduction in memory usage. Furthermore, prediction performance is equal to or better than that of the non-quantized LLM classifier.
AB - A major challenge in regression testing practice is the presence of flaky tests, which non-deterministically pass or fail when run on the same code. Previous research identified multiple categories of flaky tests. Prior research has also developed techniques for automatically detecting which tests are flaky or categorizing flaky tests, but these techniques generally involve repeatedly rerunning tests in various ways, making them costly to use. Although several recent approaches have utilized large-language models (LLMs) to predict which tests are flaky or predict flaky-test categories without needing to rerun tests, they are costly to use due to relying on a large neural network to perform feature extraction and prediction. We propose FlakyQ to improve the effectiveness of LLM-based flaky-test prediction by quantizing the LLM's weights. The quantized LLM can extract features from test code more efficiently. To make up for the loss in prediction performance due to quantization, we further train a traditional ML classifier (e.g., a random forest) to learn from the features extracted by the quantized LLM and make the same prediction. The final model has similar prediction performance while running faster than the non-quantized LLM. Our evaluation finds that FlakyQ classifiers consistently improve prediction time over the non-quantized LLM classifier, saving 25.4% in prediction time over all tests, along with a 48.4% reduction in memory usage. Furthermore, prediction performance is equal to or better than that of the non-quantized LLM classifier.
KW - Flaky Test Categorization
KW - Large-Language Models
KW - Quantization
UR - http://www.scopus.com/inward/record.url?scp=85196798233&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85196798233&partnerID=8YFLogxK
U2 - 10.1109/ICST60714.2024.00018
DO - 10.1109/ICST60714.2024.00018
M3 - Conference contribution
AN - SCOPUS:85196798233
T3 - Proceedings - 2024 IEEE Conference on Software Testing, Verification and Validation, ICST 2024
SP - 93
EP - 104
BT - Proceedings - 2024 IEEE Conference on Software Testing, Verification and Validation, ICST 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE Conference on Software Testing, Verification and Validation, ICST 2024
Y2 - 27 May 2024 through 31 May 2024
ER -