Abstract

Software defects can cause significant rework, delays, and high costs; to prevent them, the likelihood of defects must be predicted. Defect prediction relies on software metrics datasets. NASA MDP is one of the most popular collections of software metrics data for defect prediction; it comprises 13 datasets, which are generally imbalanced. Class imbalance can degrade defect prediction performance because the classifier tends to favor the majority class. Data imbalance can be handled with two approaches: data-level techniques and algorithm-level techniques. Data-level techniques aim to improve the class distribution through resampling and data synthesis. This research proposes a data-level approach using five resampling techniques: Random Oversampling (ROS), Random Undersampling (RUS), Synthetic Minority Oversampling Technique (SMOTE), Tomek Link (TL), and One-Sided Selection (OSS). The resampled data are classified with Naïve Bayes, validated using 10-fold cross-validation, and evaluated with the Area Under the ROC Curve (AUC). Across datasets, the best AUC was obtained on MC2, with a value of 0.7277 using SMOTE. Across data-level techniques, Tomek Link (TL) achieved the best average AUC, with a value of 0.62587.
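
As a rough illustration of two of the resampling techniques named above (ROS and RUS) and of the AUC metric, the following is a minimal standard-library Python sketch. The function names, the toy data, and the fixed seed are assumptions made for illustration; the abstract does not state how the study implemented these steps, and SMOTE, TL, OSS, Naïve Bayes, and cross-validation are not shown.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """ROS: duplicate randomly chosen rows of smaller classes
    until every class is as large as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """RUS: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        rows = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(rows, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy imbalanced data (hypothetical): 8 non-defective modules, 2 defective.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
X_ros, y_ros = random_oversample(X, y)
X_rus, y_rus = random_undersample(X, y)
```

In an evaluation such as the 10-fold cross-validation described above, resampling is usually applied to the training folds only, so that each test fold keeps the original class distribution; whether the study did this is not stated in the abstract.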
