Abstract

In this work, we propose a novel feature selection framework called Sparse-Modeling Based Approach for Class Specific Feature Selection (SMBA-CSFS), which simultaneously exploits the ideas of sparse modeling and class-specific feature selection. Feature selection plays a key role in several fields (e.g., computational biology), making it possible to build models with fewer variables which, in turn, are easier to explain, provide valuable insights into the importance of each variable, and may speed up experimental validation. Unfortunately, as corroborated by the no-free-lunch theorems, no approach in the literature is universally best at detecting the optimal feature subset for building a final model, so feature selection remains a challenge. The proposed procedure comprises two steps: (a) a sparse modeling-based learning technique is first used to find the best subset of features for each class of a training set; (b) the discovered feature subsets are then fed to a class-specific feature selection scheme, in order to assess the effectiveness of the selected features in classification tasks. To this end, an ensemble of classifiers is built, where each classifier is trained on its own feature subset discovered in the previous phase, and a proper decision rule is adopted to combine the ensemble responses. To evaluate the performance of the proposed method, extensive experiments have been performed on publicly available datasets, in particular from the computational biology field, where feature selection is indispensable: acute lymphoblastic leukemia and acute myeloid leukemia, human carcinomas, human lung carcinomas, diffuse large B-cell lymphoma, and malignant glioma. SMBA-CSFS is able to identify the most representative features that maximize the classification accuracy.
With the top 20 and 80 features, SMBA-CSFS exhibits promising performance compared to its competitors from the literature on all considered datasets, especially those with a larger number of features. Experiments show that the proposed approach may outperform state-of-the-art methods when the number of features is high. For this reason, the introduced approach is well suited to the selection and classification of data with a large number of features and classes.
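The class-specific ensemble of step (b) can be illustrated with a minimal sketch. Everything below is a hypothetical reconstruction, not the paper's implementation: the per-class feature subsets are hard-coded stand-ins for the output of the sparse-modeling step (a), and the decision rule (pick the class whose one-vs-rest classifier is most confident) is an assumption.

```python
# Hypothetical sketch of the class-specific ensemble (step b).
# The per-class feature subsets are placeholders for the output of
# the sparse-modeling step (a).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=50, n_informative=10,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

# One (hypothetical) feature subset per class, as step (a) would produce.
subsets = {0: [0, 3, 7, 12, 20], 1: [1, 4, 9, 15, 22], 2: [2, 5, 8, 30, 41]}

# Train one classifier per class, each on its own feature subset
# (one-vs-rest: class c against the rest).
models = {}
for c, feats in subsets.items():
    clf = SVC(kernel="linear", C=1, probability=True)
    clf.fit(X[:, feats], (y == c).astype(int))
    models[c] = clf

def predict(X_new):
    # Decision rule (an assumption here): choose the class whose
    # classifier assigns the highest membership probability.
    scores = np.column_stack(
        [models[c].predict_proba(X_new[:, feats])[:, 1]
         for c, feats in subsets.items()])
    return scores.argmax(axis=1)

pred = predict(X)
```

Any decision rule that aggregates the per-class responses (majority voting, weighted confidence, etc.) could be substituted in `predict`.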

Highlights

  • Data analysis is the process of evaluating data, which are often represented in high-dimensional feature spaces, whatever the area of study, from biology to pattern recognition to computer vision

  • We focus on feature selection, which is undertaken to identify discriminative features by eliminating those with little or no predictive information, based on certain criteria, in order to work with data in low-dimensional spaces

  • The classifiers used to assess the goodness of the selected feature subsets are a Support Vector Machine (SVM) with a linear kernel and parameter C = 1, a Naive Bayes classifier, a K-Nearest Neighbors (KNN) classifier with k = 5, and a Decision Tree
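The four evaluation classifiers above, with the stated hyperparameters, can be instantiated as follows (scikit-learn is an illustrative implementation choice, and the iris dataset a placeholder for the paper's benchmarks):

```python
# The four evaluation classifiers with the hyperparameters stated above.
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

classifiers = {
    "SVM (linear, C=1)": SVC(kernel="linear", C=1),
    "Naive Bayes": GaussianNB(),
    "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Placeholder data: in the paper, each classifier would be trained on a
# selected feature subset of a benchmark dataset instead.
X, y = load_iris(return_X_y=True)
scores = {name: cross_val_score(clf, X, y, cv=5).mean()
          for name, clf in classifiers.items()}
```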



Introduction

Data analysis is the process of evaluating data, which are often represented in high-dimensional feature spaces, whatever the area of study, from biology to pattern recognition to computer vision. High-dimensional feature spaces need to be reduced, since their feature vectors are generally uninformative, redundant, correlated with each other, and noisy. We focus on feature selection, which is undertaken to identify discriminative features by eliminating those with little or no predictive information, based on certain criteria, in order to work with data in low-dimensional spaces.
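As a point of reference for what feature selection does in general, a minimal filter-style sketch (not the paper's SMBA method) ranks features by a univariate score and keeps the top k:

```python
# Minimal, generic feature-selection sketch (not the SMBA-CSFS method):
# rank features by an ANOVA F-score and keep the top 20.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)          # 569 samples, 30 features
selector = SelectKBest(score_func=f_classif, k=20).fit(X, y)
X_reduced = selector.transform(X)                   # 30 features down to 20
```

The methods compared in this work go beyond such univariate filters, but the interface is the same: a criterion scores features, and a reduced representation is passed on to the classifiers.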

