Multi-Dimension Convolutional Neural Network for Bug Localization

Bei Wang, Chao Liu, Ling Liu, Meng Yan, Ling Xu

doi:10.1109/tsc.2020.3006214

Abstract

Software bugs remain frequent throughout the life cycle of software development and maintenance. Automatic localization of buggy source code files is critical for timely bug fixing and for improving the efficiency of software quality assurance. Various bug localization techniques have been proposed using different dimensions of features. Recent studies have shown that different dimensions of features may play different roles in bug localization. Unfortunately, how to effectively merge these dimensions of features to improve bug localization has rarely been investigated. This article presents a Multi-Dimension Convolutional Neural Network (MD-CNN) model that automatically localizes bugs based on a bug report. Our approach is novel in two respects. First, we identify and extract five statistical dimensions of features. Second, we design a Convolutional Neural Network (CNN) model that takes these five statistical dimensions of features as input and iteratively learns the complex, non-linear relationship between the features and the bug locations. The MD-CNN bug localization model is evaluated on six large-scale open-source projects. The experimental results show that MD-CNN outperforms existing representative bug localization techniques in terms of Mean Average Precision (MAP) and the number of bugs successfully localized in the top 1, 5, and 10 matched source code files.
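The two evaluation measures named in the abstract, MAP and the number of bugs localized in the top 1, 5, and 10 files, can be illustrated with a minimal sketch. It assumes the definitions commonly used in the bug localization literature (average precision normalized by the number of truly buggy files; a bug counts as localized at rank k if at least one buggy file appears among the first k ranked files). The function names and the dictionary-based data layout are hypothetical and are not taken from the article.

```python
from typing import Dict, List, Set


def average_precision(ranked_files: List[str], buggy_files: Set[str]) -> float:
    """Average precision of one ranked file list for a single bug report.

    Assumption: normalized by the total number of truly buggy files.
    """
    if not buggy_files:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(buggy_files)


def mean_average_precision(
    rankings: Dict[str, List[str]], truth: Dict[str, Set[str]]
) -> float:
    """Mean of the per-report average precision over all bug reports."""
    if not rankings:
        return 0.0
    total = sum(average_precision(rankings[b], truth[b]) for b in rankings)
    return total / len(rankings)


def top_k_hits(
    rankings: Dict[str, List[str]], truth: Dict[str, Set[str]], k: int
) -> int:
    """Number of bug reports with at least one buggy file in the top k."""
    return sum(
        1
        for b in rankings
        if any(path in truth[b] for path in rankings[b][:k])
    )
```

Here `rankings` maps each bug report ID to the file list ordered by the model's score, and `truth` maps the same ID to the set of files actually changed to fix the bug; calling `top_k_hits` with k = 1, 5, and 10 gives the three counts reported alongside MAP.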

Full Text