Abstract

Software defect prediction is among the most active research areas in software engineering because it helps improve software reliability. The availability of defect data from different projects has made cross-project defect prediction an open issue. This paper focuses on multiclass (multinomial) classification for defect prediction across different categories of cross-project data. Two ensemble learning models, Random Forest and Gradient Boosting, are used for classification. An empirical study is carried out to compare the predictive performance of within-project and cross-project prediction models. Class-level data is assigned to one of three defined categories according to the number of defects. A homogeneous set of object-oriented metrics is used to train the models. Furthermore, k-fold cross-validation is used to evaluate the training accuracy of the models. The main finding is that multinomial (multiclass) classification is applicable to cross-project data and yields results comparable to those obtained on within-project defect data, with statistical significance.
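
The labelling and validation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the class boundaries, so the cut-offs `low_max` and `medium_max`, the helper names `defect_class` and `k_fold_indices`, and the sample defect counts are all assumptions chosen for illustration only.

```python
from collections import Counter


def defect_class(defect_count, low_max=0, medium_max=2):
    """Map a class-level defect count to one of three labels (0, 1, 2).

    The cut-offs are hypothetical; the abstract does not give them.
    """
    if defect_count < 0:
        raise ValueError("defect count cannot be negative")
    if defect_count <= low_max:
        return 0
    if defect_count <= medium_max:
        return 1
    return 2


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Made-up defect counts for ten software classes (illustrative only).
defect_counts = [0, 0, 1, 2, 5, 0, 3, 1, 0, 7]
labels = [defect_class(c) for c in defect_counts]
class_distribution = Counter(labels)
folds = list(k_fold_indices(len(labels), k=5))
```

In a within-project setting the folds would be drawn from a single project's data, whereas in a cross-project setting the model would be trained on one project and evaluated on another. A classifier such as Random Forest or Gradient Boosting would then be fitted on the object-oriented metrics of each training split.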
