Abstract

Identifying the most informative features is a crucial step in feature selection. This paper focuses primarily on wrapper feature selection methods designed to detect important features with the F1-score as the target metric. As an initial step, most wrapper methods order features according to importance. In most cases, however, the importance is defined by the classification method used and varies with the characteristics of the data set. Using synthetically simulated data, we examine four existing feature ordering techniques to find the most effective ordering mechanism for identifying informative features. Based on these results, an improved method is suggested for extracting the most informative feature subset from the data set. The method orders the features by the sum of the absolute values of the first k principal component loadings, where k is a user-defined, application-specific value, and then applies a sequential feature selection method to extract the best subset of features. We further compare the performance of the proposed feature selection method with the existing Recursive Feature Elimination (RFE) by simulating data for several practical scenarios with different numbers of informative features and different imbalance rates. We also validate the method on a real-world application across several classification methods. The results, based on accuracy measures, indicate that the proposed approach performs better than the existing feature selection methods.
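A minimal sketch of the ordering-and-selection idea described above, not the authors' implementation: it assumes the loadings are taken as scikit-learn's PCA component weights, uses a logistic regression wrapper, and scores forward sequential selection by F1; the value of k, the classifier, and the selection direction are all application-specific choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

def pca_loading_order(X, k):
    """Rank features by the sum of absolute loadings on the first k principal components."""
    pca = PCA(n_components=k).fit(X)
    # components_ has shape (k, n_features); sum absolute weights per feature
    scores = np.abs(pca.components_).sum(axis=0)
    # indices of features, most important first
    return np.argsort(scores)[::-1]

# Illustrative usage (X, y are a feature matrix and class labels; k = 5 is arbitrary):
# order = pca_loading_order(X, k=5)
# sfs = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
#                                 scoring="f1", direction="forward")
# sfs.fit(X[:, order], y)
# selected_features = order[sfs.get_support()]
```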
