Abstract

It is a well-known practice in software engineering to aggregate software metrics to assess software artifacts for various purposes, such as their maintainability or their proneness to contain bugs. For different purposes, different metrics might be relevant. However, weighting these software metrics according to their contribution to the respective purpose is a challenging task. Manual approaches based on expert judgment do not scale with the number of metrics, and experts struggle when the metrics are not independent, which is rarely the case. Automated approaches based on supervised learning require reliable and generalizable training data, a ground truth, which is rarely available. We propose an automated approach to weighted metrics aggregation that is based on unsupervised learning. It derives metric scores and their weights from probability theory and aggregates them. To evaluate its effectiveness, we conducted two empirical studies on defect prediction: one on ca. 200 000 code changes and another on ca. 5 000 software classes. The results show that our approach can serve as an agnostic unsupervised predictor in the absence of a ground truth.
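As a rough illustration of this idea (a sketch of the general approach, not the paper's exact formulation), the Python example below scores each metric via its empirical cumulative distribution, derives weights without any ground truth, and aggregates the weighted scores per artifact. The entropy-based weighting and the toy metric columns are assumptions made for the example.

    import numpy as np

    def empirical_cdf_scores(values):
        # Map raw metric values to (0, 1]: the fraction of artifacts whose
        # value is at most this one (the empirical CDF). Higher raw values,
        # which usually indicate riskier code, score closer to 1.
        values = np.asarray(values, dtype=float)
        return np.searchsorted(np.sort(values), values, side="right") / len(values)

    def entropy_weights(scores):
        # Hypothetical unsupervised weighting: each metric is weighted by the
        # Shannon entropy of its score distribution, so metrics that spread
        # artifacts apart count more. No ground truth is needed.
        weights = []
        for column in scores.T:
            hist, _ = np.histogram(column, bins=10, range=(0.0, 1.0))
            p = hist[hist > 0] / hist.sum()
            weights.append(-(p * np.log(p)).sum())
        weights = np.asarray(weights)
        return weights / weights.sum()

    def aggregate(metric_matrix):
        # Rows are artifacts, columns are metrics; returns a single
        # defect-proneness score per artifact.
        scores = np.column_stack(
            [empirical_cdf_scores(column) for column in metric_matrix.T])
        return scores @ entropy_weights(scores)

    # Toy data: four artifacts, three metrics (e.g. size, complexity, churn).
    metrics = np.array([[120,  4,  2],
                        [950, 31, 17],
                        [300,  9,  5],
                        [610, 22, 11]])
    print(aggregate(metrics))  # higher score = more likely defect-prone

The key property this sketch shares with the paper's approach is that both the per-metric scores and the weights come from the observed distribution of the metrics themselves, so no labeled defect data is required.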

Highlights

  • Quality assurance is usually done with a limited budget

  • Details about the Chidamber and Kemerer (CK)+OO metrics, their churn and entropy, and the variants thereof can be found in the original paper by D'Ambros et al. (2010). They found that the model variant based on weighted churn performed best among the models built from the churn of the CK+OO metrics, and that the model based on linearly decayed entropy performed best among the models built from the entropy of these metrics

  • We conclude that our unsupervised model can outperform simple supervised models based on code change metrics, i.e., aggregating the whole set of metrics with our approach performs better than models built on a single well-chosen metric


Introduction

Quality assurance is usually done with a limited budget, so its activities must be performed as efficiently as possible. Understanding which software artifacts are likely to be problematic helps to prioritize activities and to allocate resources. To gain this knowledge, software quality assessment employs weighted software metrics. Defect prediction can be understood as an instance of multi-criteria decision making in which we decide which software artifacts are prone to contain defects. In either framing, defect prediction inherits problems connected to the dependencies between criteria or metrics and to the subjective setting of weights, as detailed in Sections 2.1 and 2.2, respectively.
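As a minimal sketch of this multi-criteria view, the Python example below ranks artifacts by a weighted sum of normalized metric values; the artifact names, metric values, and weights are invented for illustration and are not taken from the paper.

    # Defect prediction as multi-criteria decision making: rank artifacts
    # by a weighted sum of normalized metric values in [0, 1].
    artifacts = {
        "Parser.java":   {"loc": 0.9, "complexity": 0.8, "churn": 0.3},
        "Utils.java":    {"loc": 0.2, "complexity": 0.1, "churn": 0.7},
        "Renderer.java": {"loc": 0.6, "complexity": 0.7, "churn": 0.9},
    }
    weights = {"loc": 0.3, "complexity": 0.4, "churn": 0.3}  # set subjectively

    def score(metric_values):
        return sum(weights[m] * v for m, v in metric_values.items())

    for name, values in sorted(artifacts.items(),
                               key=lambda kv: score(kv[1]), reverse=True):
        print(f"{name}: {score(values):.2f}")  # most defect-prone first

Because metrics such as lines of code and complexity are typically correlated, a weighted sum like this implicitly double-counts size, and the weights themselves are set subjectively; these are exactly the two problems referred to above.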
