Abstract

Software metrics are widely used in empirical case studies to measure the complexity and quality of software systems. Recent studies have shown that threshold values can be derived for some metrics and used to predict defect-prone system modules. The goal of this paper is to empirically validate the stability of such threshold values. We analyze a wider set of software metrics than has previously been reported and perform the analysis in the context of different levels of data imbalance. We replicate a case study that derives thresholds for software metrics using a statistical model based on logistic regression, and we further analyze threshold stability under varying levels of data imbalance. The methodology is validated on a large number of subsequent releases of open source projects. We found that the threshold values of some metrics can be used to effectively predict defect-prone modules, and that the threshold values of some metrics may be influenced by the level of data imbalance. The results of this case study give valuable insight into the importance of software metrics, and the presented methodology may also be used by software quality assurance practitioners.
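To illustrate the kind of threshold derivation the abstract describes, the following is a minimal sketch of the commonly used logistic-regression approach (a Bender-style "value of an acceptable risk level"): fit P(defect | metric) and solve the fitted model for the metric value at a chosen base probability. The data, metric name, and risk level p0 here are purely hypothetical, not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: one metric value per module (e.g., a coupling metric)
# and a binary defect label. All values are synthetic, for illustration only.
rng = np.random.default_rng(0)
metric = rng.gamma(shape=2.0, scale=5.0, size=500)    # per-module metric values
p_true = 1 / (1 + np.exp(-(0.3 * metric - 4.0)))      # synthetic defect risk
defective = rng.binomial(1, p_true)                   # binary defect labels

# Fit a univariate logistic regression: P(defect) = 1 / (1 + exp(-(b0 + b1*x))).
model = LogisticRegression().fit(metric.reshape(-1, 1), defective)
b0 = model.intercept_[0]
b1 = model.coef_[0][0]

# Threshold at an acceptable risk level p0:
# solve logit(p0) = b0 + b1 * t for the metric value t.
p0 = 0.5  # hypothetical base probability
threshold = (np.log(p0 / (1 - p0)) - b0) / b1
print(f"metric threshold at p0={p0}: {threshold:.2f}")
```

Note that scikit-learn applies L2 regularization by default; a replication aiming for unregularized maximum-likelihood coefficients would typically use a package such as statsmodels instead.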
