ENSEMBLE OF SOFTWARE DEFECT PREDICTORS: AN AHP-BASED EVALUATION METHOD

YI PENG, GANG KOU, WENSHUAI WU, GUOXUN WANG, YONG SHI

doi:10.1142/s0219622011004282

Abstract

Classification algorithms that help identify software defects or faults play a crucial role in software risk management. Experimental results have shown that ensembles of classifiers are often more accurate and more robust to noisy data, and achieve a lower average error rate, than any of their constituent classifiers. However, findings are inconsistent across studies, and the performance of learning algorithms may vary with the performance measure used and the circumstances of the experiment. Therefore, more research is needed to evaluate the performance of ensemble algorithms in software defect prediction. The goal of this paper is to assess the quality of ensemble methods in software defect prediction with the analytic hierarchy process (AHP), a multicriteria decision-making approach that prioritizes decision alternatives based on pairwise comparisons. Using the AHP, this study experimentally compares the performance of several popular ensemble methods on 13 performance metrics over 10 public-domain software defect datasets from the NASA Metrics Data Program (MDP) repository. The results indicate that ensemble methods can, in general, improve the classification results of software defect prediction, and that AdaBoost gives the best results. In addition, tree- and rule-based classifiers perform better in software defect prediction than the other types of classifiers included in the experiment. Among single classifiers, K-nearest-neighbor, C4.5, and Naïve Bayes tree ranked higher than the others.
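To make the idea of prioritizing alternatives from pairwise comparisons concrete, the following is a minimal sketch of the standard AHP priority computation, not the procedure or data used in this study. The comparison matrix is hypothetical (three unnamed alternatives judged under a single criterion), the function names `ahp_priorities` and `consistency_ratio` are illustrative, and the random-index values are the commonly cited Saaty constants for matrices of size 3 to 5.

```python
# Minimal AHP sketch: derive priority weights from a pairwise comparison
# matrix and check how consistent the judgments are.
# All names and the example matrix are hypothetical illustrations.

# Commonly cited Saaty random-index values; this sketch only supports n = 3..5.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_priorities(matrix, iterations=100):
    """Approximate the principal eigenvector by power iteration.

    matrix[i][j] states how strongly alternative i is preferred over j;
    a reciprocal matrix has matrix[j][i] == 1 / matrix[i][j].
    The returned weights are normalized to sum to 1.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    return weights


def consistency_ratio(matrix, weights):
    """Return CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    product = [sum(matrix[i][j] * weights[j] for j in range(n))
               for i in range(n)]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments: alternative 0 is preferred 2x over alternative 1
# and 4x over alternative 2; alternative 1 is preferred 2x over alternative 2.
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

weights = ahp_priorities(comparisons)
print("priorities:", [round(w, 4) for w in weights])
print("consistency ratio:", round(consistency_ratio(comparisons, weights), 4))
```

In a full AHP evaluation, a matrix like this would typically be built for each criterion (for example, each performance measure), and the resulting per-criterion weights would be combined using the criteria's own priorities. A consistency ratio below roughly 0.1 is conventionally taken to mean the judgments are acceptably consistent.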

Highlights

Large and complex software systems have become an essential part of our society.
Software defect prediction can be modeled as a two-group classification problem by categorizing software units as either fault-prone or nonfault-prone using historical data.
The objective of this study is to evaluate the quality of ensemble methods for software defect prediction with the analytic hierarchy process (AHP).

Summary

Introduction

Large and complex software systems have become an essential part of our society. Defects in software systems are prevalent and expensive. Researchers have developed many classification models for software defect prediction.[2,3,4,5,6,7,8,9,10,11,12,13,14,15] Previous studies illustrate that ensemble methods, which combine classifiers through some mechanism, are superior to other methods in software defect prediction.[2,16] Other works indicate that classifiers’ performance may vary with the performance measure used and the circumstances of the experiment.[17,18,19,20] There are many ways to construct ensembles of classifiers, and how to select the most appropriate ensemble method for the software defect prediction problem has not been fully investigated.

Objectives

Methods

Results

Conclusion