Abstract

Ensuring software quality is of utmost importance now that every industry depends on software. Software bugs can have grave consequences, so identifying and fixing them is imperative for developers. Software defect prediction (SDP) focuses on identifying defect-prone areas of a codebase so that testing resources can be allocated judiciously. Sourcing historical data is not easy, especially for new software, and this is where cross-project defect prediction (CPDP) comes in. SDP, and CPDP in particular, have attracted considerable attention from the research community. Simultaneously, the versatility of neural networks (NNs) has pushed researchers to study their application to defect prediction. In most research, the architecture of an NN is arrived at through trial and error. This effort can be reduced if there is a general idea of what kind of architecture works well in a particular field. In this paper, we tackle this problem by testing six different NNs on a dataset of twenty software projects from the PROMISE repository in a strict CPDP setting. We then compare the best architecture against three proposed CPDP methods that cover a wide range of scenarios. We found that the simplest NN with dropout layers (NN-SD) performed best and was also statistically significantly better than the CPDP methods it was compared with. We used the area under the receiver operating characteristic curve (AUC-ROC) as the performance metric, and we tested statistical significance with the Friedman chi-squared test and the Wilcoxon signed-rank test.

Keywords: Neural networks; Machine learning; Software defect prediction; Cross-project defect prediction; Software quality
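As an illustration of the performance metric named above, the following is a minimal sketch of how AUC-ROC can be computed for a defect-prediction model's scores. The labels and scores below are synthetic examples, not data from the paper.

```python
def auc_roc(y_true, y_score):
    """AUC-ROC computed directly from its rank interpretation:
    the probability that a randomly chosen defective module receives
    a higher score than a randomly chosen clean one (ties count half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]  # defective modules
    neg = [s for y, s in zip(y_true, y_score) if y == 0]  # clean modules
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic example: 1 = defective, 0 = clean; scores are model outputs
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.2, 0.4, 0.8, 0.6, 0.3, 0.9, 0.1, 0.7]
print(auc_roc(y_true, y_score))  # 1.0: every defect outranks every clean module
```

An AUC-ROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why it is a common threshold-independent metric for comparing defect predictors across projects with different defect rates.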
