Abstract

Software Fault Prediction (SFP) estimates which modules of a software system are likely to be faulty. It helps developers, testers, and maintainers produce good-quality software modules through proper refactoring and reduced complexity. Researchers have developed many classification models to identify faulty modules, including Decision Tree, SVM, ANN, and Random Forest. Among these, the Decision Tree is one of the most widely used approaches; it uses a greedy strategy for node splitting. Several node-splitting criteria have been tried in recent years to generate decision trees for SFP, such as Gini-Index, CHAID, Gain Ratio, and Information Gain. Each has its pros and cons, but none performs well on its own in terms of accuracy and/or efficiency. This paper proposes a new node-splitting method, called EGIA, that combines Gini-Index and Entropy to generate the decision tree. The proposed approach has been tested on 18 open-source datasets from software implemented in different languages (C, C++, and Java). Results show that the proposed approach generates more accurate decision trees and predicts faulty software modules better than the existing approaches. The proposed node-splitting method is also more flexible than earlier criteria because it can be modified according to the user's needs and integrated with different optimization methods.
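To make the idea of a combined splitting criterion concrete, the following is a minimal sketch of greedy node splitting driven by a blend of Gini impurity and entropy. The abstract does not state the EGIA formula, so the weighted sum used here, the `alpha` parameter, and the function names `combined_impurity` and `best_threshold` are illustrative assumptions rather than the paper's method.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def combined_impurity(labels, alpha=0.5):
    # ASSUMPTION: a simple weighted blend of the two measures.
    # This is a hypothetical stand-in, not the paper's EGIA formula.
    return alpha * gini(labels) + (1.0 - alpha) * entropy(labels)


def best_threshold(values, labels, alpha=0.5):
    """Greedy search for the binary split (x <= t) on one numeric feature
    that gives the largest decrease in combined impurity.
    Returns (threshold, decrease), or (None, 0.0) if no split helps."""
    n = len(labels)
    parent = combined_impurity(labels, alpha)
    best = (None, 0.0)
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        child = (len(left) * combined_impurity(left, alpha)
                 + len(right) * combined_impurity(right, alpha)) / n
        decrease = parent - child
        if decrease > best[1]:
            best = (t, decrease)
    return best


# Toy example: a hypothetical complexity metric per module, 1 = faulty.
metric = [1, 2, 3, 10, 11, 12]
faulty = [0, 0, 0, 1, 1, 1]
threshold, decrease = best_threshold(metric, faulty)
```

In this sketch, `alpha = 1.0` reduces the criterion to pure Gini and `alpha = 0.0` to pure entropy, which is one simple way a splitting rule could be adjusted to the user's needs or tuned by an external optimizer.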
