Abstract

Identifying and selecting the most consistent subset of metrics that improves the performance of a software defect prediction model is a paramount but challenging problem, and it has received little attention in the literature. The current research investigated the consistency of the subsets of metrics produced by embedded feature selection techniques. Ten (10) feature selection techniques were used, drawn from the families of filter and wrapper-based feature selection techniques commonly used in the defect prediction domain. Ten (10) publicly available defect datasets, spanning both proprietary and open source domains, were studied. SVM-RFE-RF produced 42-93% consistent metrics across datasets, whereas the prior study on non-embedded techniques produced 56.5% consistent metrics at the median. The SVM-RFE-LF embedded feature selection technique produced 54-80% consistent metrics across datasets and 42.5% at the median. In line with the purpose stated in the title, embedded feature selection techniques produced the most efficient consistent subset selection across all datasets and among the feature selection techniques, compared with their filter and wrapper-based counterparts.
