Improving defect prediction with deep forest

Tianchi Zhou, Xin Xia, Xiang Chen, Xiaobing Sun, Bin Li

doi:10.1016/j.infsof.2019.07.003

Abstract

Context: Software defect prediction is important for ensuring software quality. Many supervised learning techniques have been applied to identify defective instances (e.g., methods, classes, and modules). Objective: However, the performance of these supervised learning techniques is still far from satisfactory, so more advanced techniques are needed to improve the performance of defect prediction models. Method: We propose a new deep forest model for building defect prediction models (DPDF). This model can identify more important defect features by using a new cascade strategy, which arranges random forest classifiers into a layer-by-layer structure. This design takes full advantage of both ensemble learning and deep learning. Results: We evaluate our approach on 25 open source projects from four public datasets (i.e., NASA, PROMISE, AEEEM and Relink). Experimental results show that our approach increases the AUC value by 5% compared with the best traditional machine learning algorithms. Conclusion: The deep strategy in DPDF is effective for software defect prediction.
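To make the layer-by-layer idea concrete, the following is a minimal sketch of a generic cascade of random forests, written with scikit-learn. It is not the authors' DPDF implementation: the function names `fit_cascade` and `predict_defect_proba`, the fixed number of layers, the two forests per layer, the 3-fold out-of-fold probabilities, and the synthetic data are all assumptions made for illustration. Each layer receives the original features concatenated with the class probabilities produced by the previous layer, and the result is scored with AUC, the metric reported in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict, train_test_split


def fit_cascade(X, y, n_layers=3, forests_per_layer=2, seed=0):
    """Fit a fixed-depth cascade; returns a list of layers (lists of fitted forests)."""
    layers, features = [], X
    for depth in range(n_layers):
        forests = [
            RandomForestClassifier(
                n_estimators=50, random_state=seed + depth * forests_per_layer + i
            )
            for i in range(forests_per_layer)
        ]
        # Out-of-fold class probabilities, so a layer does not hand
        # memorised training labels to the next layer (an assumed design choice).
        probas = [
            cross_val_predict(f, features, y, cv=3, method="predict_proba")
            for f in forests
        ]
        for f in forests:
            f.fit(features, y)
        layers.append(forests)
        # The next layer sees the original features plus this layer's probabilities.
        features = np.hstack([X] + probas)
    return layers


def predict_defect_proba(layers, X):
    """Pass X through every layer; return the averaged probability of class 1."""
    features = X
    for forests in layers:
        probas = [f.predict_proba(features) for f in forests]
        features = np.hstack([X] + probas)
    # Average the last layer's forests; column 1 is the "defective" class (labels 0/1).
    return np.mean(probas, axis=0)[:, 1]


# Synthetic, imbalanced stand-in for a defect dataset (hypothetical data).
X, y = make_classification(
    n_samples=400, n_features=20, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

layers = fit_cascade(X_train, y_train)
scores = predict_defect_proba(layers, X_test)
print(f"AUC: {roc_auc_score(y_test, scores):.3f}")
```

A fixed depth keeps the sketch short. A cascade could instead stop adding layers once validation performance stops improving, but whether DPDF does so is not stated in the abstract.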

Full Text