BL-GAN: Semi-Supervised Bug Localization via Generative Adversarial Network

Ziye Zhu, Hanghang Tong, Yu Wang, Yun Li

Research output: Contribution to journal › Article › peer-review

Abstract

A number of automated bug localization techniques have recently emerged, but they require an adequate supply of bug-fix records to train a predictive model. In practice, many projects cannot provide such records, especially new projects in their first release, because constructing a large number of bug-fix records demands expensive human effort. To capture the potential relevance distribution between bug reports and code files from a limited number of available bug-fix records, we present BL-GAN, the first semi-supervised bug localization model. BL-GAN introduces a Generative Adversarial Network that constructs synthetic bug-fix records close to the real ones by searching the project directory tree to generate file paths, instead of traversing the contents of all code files. To process bug reports, BL-GAN adopts an attention-based Transformer architecture that captures semantic and sequence information. To capture the structural information specific to code files, BL-GAN incorporates a novel multilayer Graph Convolutional Network that processes the source code as a graph. Extensive experiments on large-scale real-world datasets show that BL-GAN significantly outperforms state-of-the-art methods on all evaluation measures.
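
The idea of generating a file path by searching the directory tree, rather than scoring the contents of every code file, can be illustrated with a minimal sketch. This is not the authors' generator: the nested-dict tree, the `toy_score` relevance function, and the softmax sampling step are all assumptions made for illustration, standing in for whatever learned components BL-GAN actually uses. The sketch only shows why a root-to-file walk touches one directory level at a time.

```python
import math
import random

# Hypothetical project tree: dicts are directories, None marks a file.
# Assumes every directory is non-empty.
TREE = {
    "src": {
        "core": {"parser.py": None, "lexer.py": None},
        "ui": {"window.py": None},
    },
    "README.md": None,
}


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_score(report_tokens, name):
    """Placeholder relevance: number of report tokens found in the entry name.

    A learned model would go here; this stand-in is purely illustrative.
    """
    lowered = name.lower()
    return float(sum(tok in lowered for tok in report_tokens))


def sample_path(tree, report_tokens, rng, score=toy_score):
    """Walk from the root to a file, sampling one child per directory level."""
    parts = []
    node = tree
    while node is not None:  # None means a file has been reached
        names = sorted(node)
        probs = softmax([score(report_tokens, n) for n in names])
        choice = rng.choices(names, weights=probs, k=1)[0]
        parts.append(choice)
        node = node[choice]
    return "/".join(parts)


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_path(TREE, ["parser", "core", "src"], rng))
```

Each call scores only the entries of the directories along one root-to-file path, so the work grows with tree depth and branching factor rather than with the total number of files in the project.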

Original language: English (US)
Pages (from-to): 11112-11125
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 35
Issue number: 11
State: Published - Nov 1 2023

Keywords

  • Bug localization
  • bug report
  • generative adversarial network
  • semi-supervised learning

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
