Deep Learning With Customized Abstract Syntax Tree for Bug Localization

Hongliang Liang, Yuxing Yang, Meilin Wang, Lu Sun

doi:10.1109/access.2019.2936948

Abstract

Given a bug report, bug localization techniques can help developers automatically locate potentially buggy files. Information retrieval and deep learning approaches have been applied to bug localization by extracting lexical features from bug reports and syntactic features from source code files, but they fail to utilize the structural and semantic information of source code files. In this paper, we present CAST, a bug localization system that exploits deep learning and customized abstract syntax trees of programs to locate potentially buggy source files automatically and effectively. Specifically, CAST extracts both lexical semantics from bug reports (e.g., words) and source files (e.g., method names) and program semantics from source files (e.g., the abstract syntax tree, AST). Moreover, CAST enhances the tree-based convolutional neural network (TBCNN) model with customized ASTs, which distinguish between user-defined methods and system-provided ones to reflect their differing contributions to defects. Furthermore, customized ASTs group syntactic entities with similar semantics and prune those with little or redundant semantics in order to improve learning performance. Experimental results on four widely used software projects show that CAST significantly outperforms the state-of-the-art methods in locating buggy source files.

Highlights

For large and evolving software, developers may receive a large number of bug reports, and it is difficult and costly to manually locate the potentially buggy source files based on bug reports.
To evaluate the performance of CAST, we focus on four research questions (RQs). RQ1: What effect do the different model settings have on CAST? When building CAST, we need to determine suitable values for the hyper-parameters.
RQ2: Can CAST outperform other bug localization methods? To evaluate the capability of CAST, we compare it with four state-of-the-art bug localization tools: BugLocator [28], DNNLOC [29], DeepLocator [35], and NP-CNN [22].

Summary

INTRODUCTION

For large and evolving software, developers may receive a large number of bug reports, and it is difficult and costly to manually locate the potentially buggy source files based on bug reports. Existing tools can model natural and programming languages for bug localization, but there is room for improvement in accuracy and performance. CAST leverages a CNN to extract rich lexical semantic features, which indicate the relationship between syntactic entities (e.g., words or methods in bug reports and source files), and exploits TBCNN [9] on customized ASTs to capture hierarchical structure features, which contain the structural or semantic relations of program statements in source code files. The customized AST differentiates user-defined methods from system-provided ones to reflect their differing contributions to defects, which helps improve the accuracy of bug localization. It also groups syntactic entities with similar semantics and prunes those with little or redundant semantics to improve learning performance.
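The three AST customizations described above (separating user-defined from system-provided methods, grouping similar entities, and pruning low-value ones) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses Python's standard `ast` module on Python source purely for illustration, and the `GROUPS` table, the `PRUNED` set, and the `UserCall`/`SystemCall` labels are all hypothetical choices, not values taken from the paper.

```python
import ast
import builtins

# Hypothetical grouping table: node types assumed to have similar semantics
# are mapped to one shared label.
GROUPS = {
    "For": "Loop", "While": "Loop",
    "If": "Branch", "IfExp": "Branch",
    "Assign": "Assign", "AugAssign": "Assign", "AnnAssign": "Assign",
}

# Hypothetical prune list: node types assumed to carry little or redundant
# semantics. Their children are attached to the nearest kept ancestor.
PRUNED = {"Load", "Store", "Del", "Expr", "Pass", "arguments", "arg"}


def customize(source):
    """Parse source and return a customized tree as nested (label, children) tuples."""
    tree = ast.parse(source)
    # Names of functions defined in this file count as user-defined.
    user_defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

    def label(node):
        # Distinguish calls to user-defined functions from calls to built-ins.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in user_defined:
                return "UserCall"
            if hasattr(builtins, node.func.id):
                return "SystemCall"
        name = type(node).__name__
        return GROUPS.get(name, name)

    def convert(node):
        # Returns a list so that a pruned node can splice its children upward.
        children = [
            subtree
            for child in ast.iter_child_nodes(node)
            for subtree in convert(child)
        ]
        if type(node).__name__ in PRUNED:
            return children
        return [(label(node), children)]

    return convert(tree)[0]  # the Module root is never pruned


def flatten(tree):
    """List all labels of a customized tree in pre-order."""
    name, children = tree
    labels = [name]
    for child in children:
        labels.extend(flatten(child))
    return labels


if __name__ == "__main__":
    sample = "def helper(x):\n    return x + 1\n\nfor i in range(3):\n    print(helper(i))\n"
    print(flatten(customize(sample)))
```

In a pipeline like the one the paper describes, a tree of this kind would be fed to a tree-based model such as TBCNN; here the output is only a nested tuple so that the effect of each customization is easy to inspect.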

MOTIVATION

WORD EMBEDDING

FEATURE EXTRACTION

FEATURE COMBINATION

OPTIMIZATION FUNCTION

EVALUATION

EXPERIMENTAL RESULTS AND ANALYSIS

THREATS TO VALIDITY

CONCLUSION

Journal: IEEE Access	Publication Date: Jan 1, 2019
Citations: 44	License type: CC BY 4.0

Similar Papers

Bug Localization with Features Crossing and Structured Semantic Information Matching
Guoqing Xu ... Bin Chen
International Journal of Software Engineering and Knowledge Engineering | VOL. 33
29 Jul 2023

Cross-language bug localization
Xin Xia ... David Lo
02 Jun 2014

FineLocator: A novel approach to method-level fine-grained bug localization by query expansion
Wen Zhang ... Ziqiang Li
Information and Software Technology | VOL. 110
02 Mar 2019

Locating relevant source files for bug reports using textual analysis
Reza Gharibi ... Mohammad Hadi Sadreddini
01 Oct 2017
