Abstract
As datasets continue to increase in size, it is important to select the optimal feature subset from the original dataset to obtain the best performance in machine learning tasks. High-dimensional datasets with an excessive number of features can degrade performance in such tasks, with overfitting being a typical problem. In addition, such datasets demand large amounts of storage space and computing power, and models fitted to them can produce low classification accuracies. Thus, it is necessary to select a representative subset of features using an efficient selection method. Many feature selection methods have been proposed, including recursive feature elimination. In this paper, a hybrid recursive feature elimination method is presented that combines the feature-importance-based recursive feature elimination methods of the support vector machine, random forest, and generalized boosted regression algorithms. From the experiments, we confirm that the performance of the proposed method is superior to that of the three single recursive feature elimination methods.
Highlights
As datasets continue to increase in size, it is important to select the optimal subset of features from a raw dataset in order to obtain the best possible performance in a given machine learning task.
We propose a new feature selection method, hybrid-recursive feature elimination (hybrid-RFE), which ensembles the feature evaluation methods of support vector machine (SVM)-RFE, random forest (RF)-RFE, and gradient boosting machine (GBM)-RFE, combining their feature-weighting functions.
In the evaluation datasets, each feature contains the expression values of a specific gene; these are typical datasets that require feature selection, as they contain more than 20,000 features.
Summary
As datasets continue to increase in size, it is important to select the optimal subset of features from a raw dataset in order to obtain the best possible performance in a given machine learning task. An efficient and small feature (variable) subset is especially important for building a classification model. High-dimensional datasets can cause overfitting problems, in which case a reliable model cannot be obtained. Such datasets require high computing power and large volumes of storage space [1], and often produce models with low classification accuracy. This is called the "curse of dimensionality" [2]. It is necessary to select a representative subset of features to solve these problems.
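To make the elimination loop concrete, the following is a minimal Python sketch of the hybrid-RFE idea: at each step, feature importances from a linear SVM, a random forest, and a gradient boosting model are converted to ranks, the ranks are averaged, and the lowest-ranked surviving feature is dropped. The scikit-learn estimators, their hyperparameters, and the rank-averaging rule are illustrative assumptions, not necessarily the exact combination scheme used in the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

def hybrid_rfe(X, y, n_features_to_keep):
    """Recursively eliminate the feature with the worst combined rank."""
    active = list(range(X.shape[1]))  # indices of surviving features
    while len(active) > n_features_to_keep:
        Xa = X[:, active]
        # Importance scores from each of the three models.
        svm = SVC(kernel="linear").fit(Xa, y)
        svm_score = np.abs(svm.coef_).sum(axis=0)  # |w| of the linear SVM
        rf_score = RandomForestClassifier(
            n_estimators=200, random_state=0).fit(Xa, y).feature_importances_
        gbm_score = GradientBoostingClassifier(
            random_state=0).fit(Xa, y).feature_importances_
        # Convert each score vector to ranks (0 = least important),
        # then average the ranks across the three models.
        avg_rank = np.mean(
            [np.argsort(np.argsort(s)) for s in (svm_score, rf_score, gbm_score)],
            axis=0)
        # Drop the surviving feature with the lowest average rank.
        del active[int(np.argmin(avg_rank))]
    return active  # indices of the selected features in the original X
```

For gene expression datasets with more than 20,000 features, dropping one feature per iteration is expensive; eliminating a fixed fraction of the lowest-ranked features at each step is a common speed-up for RFE-style methods.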