Abstract

Cross-project defect prediction (CPDP) is a key method for estimating defect-prone modules of software products. CPDP is an attractive approach because it provides defect predictions for projects whose own data are insufficient. Recent studies specifically address how to pick training data from large datasets using a feature selection (FS) process, which contributes most to the end results. A classifier then assigns the selected data to defective and non-defective classes. The aim of our research is to select the optimal set of features from multi-class data through a search-based optimizer for CPDP. We adopt an explanatory research type and a quantitative approach for our experimentation. The F1 measure is our dependent variable, while the KNN filter, ANN filter, random forest ensemble (RFE) model, genetic algorithm (GA), and the classifiers are the manipulated independent variables. Our experiment follows a one-factor, one-treatment (1F1T) design for RQ1, and a one-factor, two-treatment (1F2T) design for RQ2, RQ3, and RQ4. We first carried out exploratory data analysis (EDA) to understand the nature of our dataset and then pre-processed the data to resolve the issues identified. During preprocessing we found that the data are multi-class; therefore, we first rank features and select multiple feature sets using the information gain algorithm to obtain maximum variation in features across classes. To remove noise, we use the ANN filter and obtain results 40% to 60% better than the NN filter of the baseline paper (all, ckloc, IG). We then apply a search-based optimizer, the random forest ensemble (RFE), to obtain the best feature set for the software prediction model, yielding results 30% to 50% better than genetic instance selection (GIS). Finally, we use a classifier to predict defects for CPDP; compared with the baseline classifier on the F1 measure, our results are almost 35% higher. We validate the experiment using the Wilcoxon test and Cohen's d.
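The sketch below is a minimal illustration of the kind of pipeline the abstract describes, assuming a scikit-learn and SciPy environment. The synthetic data, the top-k cut-off, and the use of SelectFromModel as a stand-in for the paper's random forest ensemble feature search are assumptions for illustration only; the ANN-filter noise-removal step is omitted, and none of the numbers reproduce the paper's results.

```python
# Illustrative CPDP pipeline sketch: information-gain ranking, a random-forest-based
# feature selector, a classifier evaluated with F1, and a Wilcoxon comparison.
# All data and parameters here are hypothetical.
import numpy as np
from scipy.stats import wilcoxon
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, mutual_info_classif
from sklearn.metrics import f1_score

# Hypothetical cross-project data: the source project trains the model,
# the target project plays the role of the project lacking historical data.
rng = np.random.default_rng(0)
X_source = rng.normal(size=(500, 20))
y_source = rng.integers(0, 2, size=500)
X_target = rng.normal(size=(200, 20))
y_target = rng.integers(0, 2, size=200)

# 1) Rank features by information gain (mutual information) and keep the top k.
info_gain = mutual_info_classif(X_source, y_source, random_state=0)
top_k = np.argsort(info_gain)[::-1][:10]

# 2) Refine the feature set with a random-forest-driven selector (a simplified
#    stand-in for the paper's RFE step): keep features at or above median importance.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0), threshold="median"
)
selector.fit(X_source[:, top_k], y_source)
X_src_sel = selector.transform(X_source[:, top_k])
X_tgt_sel = selector.transform(X_target[:, top_k])

# 3) Train a classifier on the source project and predict target-project defects.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_src_sel, y_source)
pred = clf.predict(X_tgt_sel)
print("F1 on target project:", f1_score(y_target, pred))

# 4) Compare paired F1 scores of two approaches with the Wilcoxon signed-rank
#    test and Cohen's d (the paired scores below are illustrative).
f1_ours = np.array([0.61, 0.58, 0.64, 0.70, 0.55])
f1_base = np.array([0.45, 0.50, 0.48, 0.52, 0.41])
stat, p = wilcoxon(f1_ours, f1_base)
pooled_sd = np.sqrt((f1_ours.var(ddof=1) + f1_base.var(ddof=1)) / 2)
cohens_d = (f1_ours.mean() - f1_base.mean()) / pooled_sd
print("Wilcoxon p-value:", p, "Cohen's d:", cohens_d)
```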
