Abstract

Imbalanced data is a significant issue in software fault prediction, and handling it is a challenge for software engineers who need to predict faults early. Over the last two decades, many researchers have used the synthetic minority oversampling technique (SMOTE), SMOTE for regression, and similar techniques to preprocess imbalanced software fault data. However, these preprocessing techniques do not yield consistently good accuracy, especially in inter-release and cross-project fault prediction. Learning from imbalanced fault data to predict the number of software faults has not yet been explored in depth. To address this gap, we explore an efficient machine learning technique, the extreme learning machine (ELM), for predicting the number of software faults. Furthermore, we propose a new variant of ELM, the weighted regularization ELM, to generalize the imbalanced data to balanced data. To validate the proposed imbalanced learning model, we use 26 open-source PROMISE software fault datasets and three prediction scenarios: intra-release, inter-release, and cross-project. The experimental results show that the proposed approach improves performance in predicting the number of faults.
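To make the idea of a weighted, regularized ELM concrete, the following is a minimal sketch and not the formulation proposed in the paper, which the abstract does not specify. It assumes a single hidden layer with random, untrained weights, a tanh activation, per-sample weights equal to the inverse frequency of each fault count, and a ridge penalty controlled by `C`. The function names `fit_weighted_elm` and `predict_elm`, and all parameter defaults, are hypothetical.

```python
import numpy as np


def fit_weighted_elm(X, y, n_hidden=20, C=1.0, seed=0):
    """Fit a weighted, ridge-regularized ELM regressor (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Standard ELM step: hidden-layer weights are drawn at random and never trained.
    W_in = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W_in + b)  # hidden-layer output, shape (n_samples, n_hidden)

    # Assumed weighting scheme: rarer fault counts receive larger sample weights.
    values, counts = np.unique(y, return_counts=True)
    freq = dict(zip(values, counts))
    w = np.array([1.0 / freq[v] for v in y])

    # Weighted ridge solution: beta = (I/C + H^T W H)^(-1) H^T W y,
    # where W is the diagonal matrix of sample weights.
    HtW = H.T * w
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ y)
    return W_in, b, beta


def predict_elm(model, X):
    """Predict fault counts with a model returned by fit_weighted_elm."""
    W_in, b, beta = model
    return np.tanh(np.asarray(X, dtype=float) @ W_in + b) @ beta
```

In this sketch only the output weights `beta` are learned, in closed form, which is what makes ELM training fast; the sample weights shift the fit toward the rare, fault-heavy modules that an unweighted least-squares fit would tend to underweight.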
