An Adaptive Rank Aggregation-Based Ensemble Multi-Filter Feature Selection Method in Software Defect Prediction.

Abdullateef O Balogun, Ganesh Kumar, Victor E Adeyemo, Saipunidzam Mahamad, Malek A Almomani, Shuib Basri, Abdullahi A Imam, Luiz Fernando Capretz

doi:10.3390/e23101274

Abstract

Feature selection is known to be an applicable solution to address the problem of high dimensionality in software defect prediction (SDP). However, choosing an appropriate filter feature selection (FFS) method that will generate and guarantee optimal features in SDP is an open research issue, known as the filter rank selection problem. As a solution, the combination of multiple filter methods can alleviate the filter rank selection problem. In this study, a novel adaptive rank aggregation-based ensemble multi-filter feature selection (AREMFFS) method is proposed to resolve high dimensionality and filter rank selection problems in SDP. Specifically, the proposed AREMFFS method is based on assessing and combining the strengths of individual FFS methods by aggregating multiple rank lists in the generation and subsequent selection of top-ranked features to be used in the SDP process. The efficacy of the proposed AREMFFS method is evaluated with decision tree (DT) and naïve Bayes (NB) models on defect datasets from different repositories with diverse defect granularities. Findings from the experimental results indicated the superiority of AREMFFS over other baseline FFS methods that were evaluated, existing rank aggregation based multi-filter FS methods, and variants of AREMFFS as developed in this study. That is, the proposed AREMFFS method not only had a superior effect on prediction performances of SDP models but also outperformed baseline FS methods and existing rank aggregation based multi-filter FS methods. Therefore, this study recommends the combination of multiple FFS methods to utilize the strength of respective FFS methods and take advantage of filter–filter relationships in selecting optimal features for SDP processes.
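The idea of aggregating multiple rank lists and then selecting top-ranked features can be illustrated with a minimal sketch. It is not the AREMFFS algorithm itself, because the abstract does not specify its adaptive steps. The sketch assumes that each filter produces a relevance score per feature (higher is better), that ranks are combined by a plain arithmetic mean, and that ties are broken alphabetically. The feature names, scores, and function names are hypothetical.

```python
from statistics import mean


def scores_to_ranks(scores):
    """Convert filter scores (higher = more relevant) to ranks (1 = best)."""
    # Sort by descending score; break ties alphabetically (an assumption).
    order = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: pos for pos, feature in enumerate(order, start=1)}


def aggregate_and_select(score_lists, top_k):
    """Average the per-filter ranks and return the top_k features."""
    rank_lists = [scores_to_ranks(s) for s in score_lists]
    features = rank_lists[0].keys()
    combined = {f: mean(r[f] for r in rank_lists) for f in features}
    # A lower mean rank means the filters agree the feature is more relevant.
    ordered = sorted(combined, key=lambda f: (combined[f], f))
    return ordered[:top_k]


# Hypothetical scores from three filter methods on four software metrics.
filter_one = {"loc": 9.1, "cyclomatic": 7.4, "fan_in": 1.2, "comments": 0.3}
filter_two = {"loc": 0.40, "cyclomatic": 0.45, "fan_in": 0.05, "comments": 0.10}
filter_three = {"loc": 0.30, "cyclomatic": 0.20, "fan_in": 0.25, "comments": 0.01}

selected = aggregate_and_select([filter_one, filter_two, filter_three], top_k=2)
print(selected)
```

Working on ranks rather than raw scores keeps filters with different score scales comparable, which is one reason rank aggregation is a natural way to combine them.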

Highlights

Scenario A assesses and compares the prediction performances of naïve Bayes (NB) and decision tree (DT) models based on the proposed adaptive rank aggregation-based ensemble multi-filter feature selection (AREMFFS) method and baseline feature selection (FS) methods (CS, information gain (IG), REF, and NoFS).
Scenario B evaluates and compares the prediction performances of NB and DT models based on the proposed AREMFFS method and the existing rank aggregation-based multi-filter FS methods (Min, Max, Mean, Range, GMean, HMean).
This study focuses on resolving the high dimensionality and filter rank selection problems in software defect prediction by proposing a novel AREMFFS method.
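The six existing rank aggregation functions named in Scenario B can be sketched as follows. This is an illustrative reading only: the page does not give their formulas, so Range is assumed here to mean the maximum rank minus the minimum rank, and GMean and HMean are assumed to be the geometric and harmonic means of a feature's ranks. The feature names and rank values are hypothetical.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Each aggregator maps one feature's ranks across filters to a single value.
AGGREGATORS = {
    "Min": min,
    "Max": max,
    "Mean": mean,
    "Range": lambda ranks: max(ranks) - min(ranks),  # assumed definition
    "GMean": geometric_mean,
    "HMean": harmonic_mean,
}


def aggregate_ranks(rank_lists, method):
    """Combine per-filter rank dictionaries with the named aggregator."""
    combine = AGGREGATORS[method]
    features = rank_lists[0].keys()
    return {f: combine([r[f] for r in rank_lists]) for f in features}


# Hypothetical rank lists from three filters (1 = most relevant).
rank_lists = [
    {"loc": 1, "cyclomatic": 2, "fan_in": 3},
    {"loc": 2, "cyclomatic": 1, "fan_in": 3},
    {"loc": 1, "cyclomatic": 3, "fan_in": 2},
]

for name in AGGREGATORS:
    print(name, aggregate_ranks(rank_lists, name))
```

Each function stresses a different aspect of agreement among filters: Min rewards a feature that any one filter ranks highly, Max penalizes a feature that any one filter ranks poorly, and the three means balance the two.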

Summary

Introduction

Selecting a fitting FFS method for SDP is a problem. This is based on findings from existing studies on the impact of FFS in SDP, which concluded that there is no single best FFS method and that their respective performances depend on the selected datasets and classifiers [15,19,21,22]. This observation may be due to the incomplete and disjointed feature rankings that FFS methods produce in SDP.
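One way to see how disjointed two filters' rankings are is to compare the feature subsets they would select. The sketch below is not taken from the paper. It assumes a Jaccard overlap of the top-k features as the measure of agreement, and the rank values and function names are hypothetical.

```python
def top_k(ranks, k):
    """Return the set of the k best-ranked features (1 = best)."""
    ordered = sorted(ranks, key=lambda f: (ranks[f], f))
    return set(ordered[:k])


def jaccard(a, b):
    """Jaccard similarity of two sets; 1.0 means identical selections."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical rank lists from two filter methods on the same dataset.
filter_a = {"loc": 1, "cyclomatic": 2, "fan_in": 3, "comments": 4}
filter_b = {"loc": 4, "cyclomatic": 1, "fan_in": 2, "comments": 3}

overlap = jaccard(top_k(filter_a, 2), top_k(filter_b, 2))
print(f"top-2 overlap: {overlap:.2f}")
```

A low overlap means the choice of filter alone changes which features reach the classifier, which is the situation that motivates combining filters.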

Related Works

Classification Algorithms

Feature Selection Method

Multi-Filter FS Phase

Ensemble Rank Aggregation Phase

Backtracking Function Phase

Software Defect Datasets

Experimental Procedure

Performance Evaluation Metrics

Results and Discussion

Experimental Results on Scenario A

Box-plot

Scott–Knott ESD

Experimental Results on Scenario B

10. Box-plot

13. Scott–Knott ESD

16. Box-plot

Conclusions

Journal: Entropy (Basel, Switzerland)
Publication Date: Sep 29, 2021
Citations: 13
License type: CC BY 4.0
