An empirical assessment of threshold techniques to discriminate the fault status of software

Navneet Kaur, Hardeep Singh

doi:10.1016/j.jksuci.2021.03.003

Abstract

To identify the high-risk classes in a software system, researchers often turn to statistical and computational intelligence models in preference to the more easily performed binary classification through threshold values. In the latter case, knowledge of the threshold values alone can help developers and testers recognize the risk-prone classes. The current study aims to identify the dichotomy that most proficiently discriminates between the faulty and non-faulty classes of software systems. For this purpose, the study examined seven threshold techniques, namely odds ratio, Cohen’s kappa, maximum sum of sensitivity and specificity, concordance probability, Alves rankings, value of an acceptable risk level, and standard deviation plus mean, to identify which ones yield an optimal threshold for the software metrics. The dichotomous power of any threshold technique depends on the software measures whose optimal values are to be identified. This study used the widely adopted object-oriented metrics of the Chidamber and Kemerer metric suite. The discrimination results of the techniques were then compared, and the experiments revealed that concordance probability and maximum sum of sensitivity and specificity achieved the best performance, whereas the odds ratio performed significantly worse than the best-performing methods.
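As a rough illustration of how three of the named techniques pick a cut-off for a single metric, the following minimal sketch computes thresholds by maximum sum of sensitivity and specificity, by concordance probability (taken here as the product of sensitivity and specificity), and by standard deviation plus mean. It is not the study's procedure: the function names, the rule that a class is predicted faulty when its metric value is at or above the threshold, and the toy WMC values are all assumptions made for the example.

```python
import statistics
from typing import Callable, Sequence, Tuple


def sens_spec(values: Sequence[float], faulty: Sequence[bool],
              threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when 'value >= threshold' means faulty (assumed rule)."""
    tp = sum(1 for v, f in zip(values, faulty) if f and v >= threshold)
    fn = sum(1 for v, f in zip(values, faulty) if f and v < threshold)
    tn = sum(1 for v, f in zip(values, faulty) if not f and v < threshold)
    fp = sum(1 for v, f in zip(values, faulty) if not f and v >= threshold)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def best_threshold(values: Sequence[float], faulty: Sequence[bool],
                   score: Callable[[float, float], float]) -> float:
    """Try every observed metric value as a cut-off and keep the best-scoring one."""
    candidates = sorted(set(values))
    return max(candidates, key=lambda t: score(*sens_spec(values, faulty, t)))


def max_sum_threshold(values, faulty):
    # Maximum sum of sensitivity and specificity.
    return best_threshold(values, faulty, lambda se, sp: se + sp)


def concordance_threshold(values, faulty):
    # Concordance probability, taken here as sensitivity * specificity.
    return best_threshold(values, faulty, lambda se, sp: se * sp)


def mean_plus_sd_threshold(values):
    # Standard deviation plus mean; uses the metric distribution only, not fault labels.
    return statistics.mean(values) + statistics.stdev(values)


# Hypothetical WMC values and fault labels for eight classes (made-up data).
wmc = [2, 3, 5, 8, 12, 15, 20, 30]
faulty = [False, False, False, False, True, False, True, True]

print(max_sum_threshold(wmc, faulty))
print(concordance_threshold(wmc, faulty))
print(mean_plus_sd_threshold(wmc))
```

The first two techniques need fault labels and search over candidate cut-offs, while standard deviation plus mean depends only on the metric's distribution, which is one reason the techniques can disagree on the same data.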

Full Text