Can ChatGPT Repair Non-Order-Dependent Flaky Tests?

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Regression testing helps developers check whether the latest code changes break software functionality. Flaky tests, which can non-deterministically pass or fail on the same code version, may mislead developers, causing them to miss real bugs or to spend time hunting for bugs that do not exist. Existing flakiness detection and mitigation techniques have primarily focused on general order-dependent (OD) and implementation-dependent (ID) flaky tests. Research on repairing test flakiness is also scarce: most of it has focused on repairing OD flaky tests, and only a few studies have explored repairing a subcategory of non-order-dependent (NOD) flaky tests, those caused by asynchronous waits. As a result, there is a demand for techniques to reproduce, detect, and repair NOD flaky tests. Large language models (LLMs) have shown great effectiveness in several programming tasks. To explore the potential of LLMs in addressing NOD flakiness, this paper investigates the possibility of using ChatGPT to repair different categories of NOD flaky tests. Our comprehensive study on 118 flaky tests from the IDoFT dataset shows that ChatGPT, despite being a leading LLM with notable success in multiple code generation tasks, is ineffective in repairing NOD test flakiness, even when following the best practices for prompt crafting. We investigated the reasons behind ChatGPT's failure to repair NOD tests, which provided valuable insights into the next steps for advancing the field of NOD test flakiness repair.

Original language: English (US)
Title of host publication: Proceedings - 2024 IEEE/ACM International Flaky Tests Workshop, FTW 2024
Publisher: Association for Computing Machinery
Pages: 22-29
Number of pages: 8
ISBN (Electronic): 9798400705588
State: Published - Apr 14 2024
Event: 1st International Flaky Tests Workshop, FTW 2024, co-located with the 46th ACM/IEEE International Conference on Software Engineering, ICSE 2024 - Lisbon, Portugal
Duration: Apr 14 2024 → …

Publication series

Name: Proceedings - 2024 IEEE/ACM International Flaky Tests Workshop, FTW 2024

Conference

Conference: 1st International Flaky Tests Workshop, FTW 2024, co-located with the 46th ACM/IEEE International Conference on Software Engineering, ICSE 2024
Country/Territory: Portugal
City: Lisbon
Period: 4/14/24 → …

Keywords

  • large language models
  • software testing
  • test flakiness

ASJC Scopus subject areas

  • Software
