Which process metrics can significantly improve defect prediction models? An empirical study

Lech Madeyski, Marian Jureczko

doi:10.1007/s11219-014-9241-7

Abstract

Knowledge of the software metrics that serve as defect indicators is vital for the efficient allocation of quality assurance resources. Process metrics, although sometimes difficult to collect, have recently become popular in defect prediction. However, to correctly identify the process metrics that are actually worth collecting, we need evidence validating their ability to improve product metric-based defect prediction models. This paper presents an empirical evaluation in which several process metrics were investigated in order to identify the ones that significantly improve defect prediction models based on product metrics. Data from a wide range of software projects (both industrial and open source) were collected. The predictions of models that use only product metrics (simple models) were compared with the predictions of models that use product metrics as well as one of the process metrics under scrutiny (advanced models). To decide whether the improvements were significant, statistical tests were performed and effect sizes were calculated. The advanced defect prediction models trained on a data set containing product metrics and additionally Number of Distinct Committers (NDC) were significantly better than the simple models without NDC, while the effect size was medium and the probability of superiority (PS) of the advanced models over the simple ones was high ($p=.016$, $r=-.29$, $\mathrm{PS}=.76$), which is a substantial finding useful in defect prediction. A similar result with a slightly smaller PS was achieved by the advanced models trained on a data set containing product metrics and additionally all of the investigated process metrics ($p=.038$, $r=-.29$, $\mathrm{PS}=.68$). The advanced models trained on a data set containing product metrics and additionally Number of Modified Lines (NML) were significantly better than the simple models without NML, but the effect size was small ($p=.038$, $r=.06$). Hence, it is reasonable to recommend the NDC process metric for building defect prediction models.
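The comparison reported in the abstract (a significance test plus an effect size r and a probability of superiority PS over paired model results) can be illustrated with a minimal Python sketch. Several things here are assumptions not confirmed by the abstract: that the paired test is a Wilcoxon signed-rank test with a normal approximation, that r is computed as Z divided by the square root of the total number of observations, that PS for dependent samples is the share of pairs in which the advanced model wins (ties counted as half), and that lower scores are better. The function names and the sample numbers are hypothetical.

```python
import math


def probability_of_superiority(simple, advanced, lower_is_better=True):
    """Share of paired projects where the advanced model wins; ties count half.

    Assumption: this is the dependent-samples form of PS.
    """
    if len(simple) != len(advanced) or not simple:
        raise ValueError("need two non-empty samples of equal length")
    wins = 0.0
    for s, a in zip(simple, advanced):
        if a == s:
            wins += 0.5
        elif (a < s) == lower_is_better:
            wins += 1.0
    return wins / len(simple)


def signed_rank_effect_size(simple, advanced):
    """Effect size r = Z / sqrt(N) from a Wilcoxon signed-rank statistic.

    Assumptions: zero differences are dropped, Z uses the normal
    approximation without a tie correction, and N is the total number
    of observations (two per remaining pair).
    """
    diffs = [a - s for s, a in zip(simple, advanced) if a != s]
    n = len(diffs)
    if n == 0:
        raise ValueError("all pairs are tied")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return z / math.sqrt(2 * n)


# Hypothetical per-project scores (lower is better), not data from the paper.
simple_scores = [60, 55, 70, 65, 50, 62]
advanced_scores = [50, 56, 60, 58, 47, 62]
ps = probability_of_superiority(simple_scores, advanced_scores)
r = signed_rank_effect_size(simple_scores, advanced_scores)
print(f"PS = {ps:.2f}, r = {r:.2f}")
```

With this sign convention, a negative r means the advanced models tend to produce lower scores than the simple ones. For real analyses an established statistics library would be a safer choice than this hand-rolled approximation.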

Highlights

Software development companies are seeking ways to improve the quality of software systems without allocating too many resources to quality assurance activities such as testing.
This paper presents an empirical evaluation in which several process metrics were investigated in order to identify the ones that significantly improve defect prediction models based on product metrics.
When the advanced models turned out to be significantly better than the simple ones, we calculated the effect size in order to assess whether the investigated process metric may be useful in the practice of software defect prediction.

Summary

Introduction

Software development companies are seeking ways to improve the quality of software systems without allocating too many resources to quality assurance activities such as testing. Applying the same testing effort to all modules of a software system is not an optimal approach, since defects are not distributed uniformly among the individual parts of a system. It is possible to test only a small part of a software system and find most of the defects. Defect prediction models, in turn, may be used to find the defect-prone classes. Except in critical projects, quality assurance efforts should be focused on the most defect-prone classes in order to save valuable time and financial resources and, at the same time, to increase the quality of delivered software products.
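The claim that testing a small part of a system can reveal most of its defects can be made concrete with a short Python sketch. It computes the fraction of classes that would have to be inspected, most defective first, to cover a target share of all defects. The function name, the 80% target, and the defect counts are hypothetical and are not taken from the paper; the ordering uses the known defect counts, whereas a prediction model would have to estimate that ordering.

```python
def share_of_classes_to_test(defects_per_class, target_share=0.8):
    """Fraction of classes to inspect, most defective first, so that at
    least `target_share` of all recorded defects is covered."""
    if not 0 < target_share <= 1:
        raise ValueError("target_share must be in (0, 1]")
    total = sum(defects_per_class)
    if total == 0:
        raise ValueError("no defects recorded")
    found = 0
    ranked = sorted(defects_per_class, reverse=True)
    for inspected, count in enumerate(ranked, start=1):
        found += count
        if found >= target_share * total:
            return inspected / len(defects_per_class)
    # Reached only through floating-point rounding at the boundary.
    return 1.0


# Hypothetical defect counts for ten classes, skewed toward a few classes.
defects = [9, 6, 3, 1, 1, 0, 0, 0, 0, 0]
print(share_of_classes_to_test(defects))
```

The more skewed the defect distribution, the smaller this fraction becomes, which is the motivation for focusing quality assurance on the classes a model flags as defect-prone.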

Journal: Software Quality Journal	Publication Date: Jun 17, 2014
Citations: 204	License type: CC BY 4.0

Similar Papers

The Characteristics of False-Negatives in File-level Fault Prediction
Harold Valdivia-Garcia ... Meiyappan Nagappan
08 Nov 2017

An Empirical Study on Software Fault Prediction Using Product and Process Metrics
Raed Shatnawi ... Alok Mishra
International Journal of Information Technologies and Systems Approach | VOL. 14
01 Jan 2020

Using product, process, and execution metrics to predict fault-prone software modules with classification trees
T.M Khoshgoftaar ... R Shan
01 Nov 2000

Defect Prediction using Combined Product and Project Metrics - A Case Study from the Open Source "Apache" MyFaces Project Family
Dindin Wahyudin ... Dietmar Winkler
01 Sep 2008
