A novel feature selection scheme for high-dimensional data sets: four-Staged Feature Selection

Ayça Çakmak Pehlivanlı

doi:10.1080/02664763.2015.1092112

Abstract

Classifying high-dimensional data sets is a major challenge for statistical learning and data mining algorithms. To apply classification methods effectively to such data, feature selection is an indispensable pre-processing step of the learning process. In this study, we consider the problem of constructing an effective feature selection and classification scheme for data sets that have a small sample size and a large number of features. A novel feature selection approach, named four-Staged Feature Selection, is proposed to overcome the high-dimensional data classification problem by selecting informative features. The proposed method first selects candidate features with a number of filtering methods based on different metrics, and then applies semi-wrapper, union and voting stages, in that order, to obtain the final feature subsets. Several statistical learning and data mining methods were applied to verify the efficiency of the selected features. To test the adequacy of the proposed method, 10 different microarray data sets are employed because of their high number of features and small sample size.
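The abstract names the four stages (filtering, semi-wrapper, union, voting) but does not specify them, so the following is only a minimal sketch of how such a pipeline could be wired together for a two-class problem. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three filter metrics (Welch t-score, signal-to-noise ratio, rank-based AUC), the greedy leave-one-out nearest-centroid pass standing in for the semi-wrapper, and the hypothetical names `four_stage_sketch`, `k` and `min_votes`.

```python
import numpy as np

# Stage 1 filters (assumed, not the paper's): each returns one score per feature.
def t_score(X, y):
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)

def snr_score(X, y):
    a, b = X[y == 0], X[y == 1]
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (a.std(axis=0) + b.std(axis=0) + 1e-12)

def auc_score(X, y):
    # Rank-based separation: |AUC - 0.5| per feature (ties are not averaged).
    ranks = X.argsort(axis=0).argsort(axis=0) + 1
    n1, n0 = int((y == 1).sum()), int((y == 0).sum())
    u = ranks[y == 1].sum(axis=0) - n1 * (n1 + 1) / 2
    return np.abs(u / (n1 * n0) - 0.5)

def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    idx = np.arange(len(y))
    hits = 0
    for i in idx:
        Xtr, ytr = X[idx != i], y[idx != i]
        c0 = Xtr[ytr == 0].mean(axis=0)
        c1 = Xtr[ytr == 1].mean(axis=0)
        pred = int(np.linalg.norm(X[i] - c1) < np.linalg.norm(X[i] - c0))
        hits += int(pred == y[i])
    return hits / len(y)

def semi_wrapper(X, y, ranked):
    """Assumed stage 2: walk the ranked candidates once and keep a feature
    only if adding it improves leave-one-out accuracy."""
    kept, best = [], 0.0
    for j in ranked:
        acc = loo_accuracy(X[:, kept + [j]], y)
        if acc > best:
            kept.append(j)
            best = acc
    return kept

def four_stage_sketch(X, y, k=20, min_votes=2):
    filters = [t_score, snr_score, auc_score]
    # Stage 1: top-k candidate features per filter.
    candidates = [np.argsort(-f(X, y))[:k].tolist() for f in filters]
    # Stage 2: refine each candidate list with the semi-wrapper pass.
    refined = [semi_wrapper(X, y, c) for c in candidates]
    # Stage 3: union of the refined subsets.
    union = sorted(set().union(*refined))
    # Stage 4: keep features chosen by at least min_votes filters.
    voted = [j for j in union if sum(j in r for r in refined) >= min_votes]
    return union, voted
```

Called as `union, voted = four_stage_sketch(X, y)` with `X` a samples-by-features NumPy array and `y` a 0/1 label vector, the sketch returns the union subset and the stricter voted subset. The voted subset can be empty when the filters disagree, which is one reason a real scheme might report both.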

Journal: Journal of Applied Statistics
Publication Date: Oct 12, 2015
Citations: 38
