Abstract

Software defect prediction (SDP) identifies likely defects in the early phases of the software development life cycle. This early warning helps remove defects sooner and yields more cost-effective, higher-quality software products. A wide range of statistical and machine learning models have been employed to predict defects in software modules. However, SDP datasets are imbalanced by nature, and handling this imbalance is pivotal to developing a successful defect prediction model. Imbalanced software datasets have nonuniform class distributions, with far fewer instances belonging to one class than to the other. This article proposes a novel hybrid methodology for imbalanced learning, the Hellinger net model, to improve defect prediction for software modules. Hellinger net, a tree-to-network mapped model, is a deep feedforward neural network with a built-in hierarchy, just like decision trees. It also exploits a skew-insensitive distance measure, the Hellinger distance, to handle class imbalance. On the theoretical side, the article proves the consistency of the proposed model. Thorough experiments on ten NASA SDP datasets show the superiority of the proposed method.
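
To illustrate why the Hellinger distance is considered skew-insensitive, the following is a minimal sketch of its use as a split criterion for a binary decision-tree split. It is not the article's implementation: the function name `hellinger_split_score`, the threshold-based split, and the 0/1 label encoding (1 = defective) are assumptions made for illustration. The score compares the fraction of each class that falls into each branch, so it depends on class-conditional rates rather than on class priors.

```python
import math


def hellinger_split_score(feature, labels, threshold):
    """Hellinger distance between the class-conditional branch
    distributions induced by the split `feature <= threshold`.

    Hypothetical helper for illustration; labels are assumed to be
    0 (non-defective) or 1 (defective).
    """
    pos = [x for x, y in zip(feature, labels) if y == 1]
    neg = [x for x, y in zip(feature, labels) if y == 0]
    if not pos or not neg:
        return 0.0  # the score is undefined with a single class; treat as no separation

    total = 0.0
    for in_branch in (lambda x: x <= threshold, lambda x: x > threshold):
        # Fraction of each class that lands in this branch.
        p = sum(1 for x in pos if in_branch(x)) / len(pos)
        q = sum(1 for x in neg if in_branch(x)) / len(neg)
        total += (math.sqrt(p) - math.sqrt(q)) ** 2
    return math.sqrt(total)


# One defective module among three clean ones, perfectly separated at 5.
feature = [1, 2, 3, 10]
labels = [0, 0, 0, 1]
balanced_score = hellinger_split_score(feature, labels, threshold=5)

# Replicating the majority class 100 times leaves the score unchanged.
skewed_feature = [1, 2, 3] * 100 + [10]
skewed_labels = [0, 0, 0] * 100 + [1]
skewed_score = hellinger_split_score(skewed_feature, skewed_labels, threshold=5)
```

Because only within-class proportions enter the formula, the two scores above coincide even though the class ratio changes from 3:1 to 300:1. A criterion based on class priors, such as information gain, does not generally have this property.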
