Abstract

In what follows, we introduce two Bayesian models for feature selection in high-dimensional data, designed specifically for classification. We take two approaches to the problem: one discards the components that have "almost constant" values (Model 1), while the other retains the components whose between-group variation is larger than their within-group variation (Model 2). We assume that p ≫ n, i.e. the number of components p is much larger than the number of samples n, and that only a few of those p components are useful for subsequent classification. We show that particular cases of the above two models recover familiar variance-based or ANOVA-based component selection. When there are only two classes and the features are a priori independent, Model 2 reduces to the Feature Annealed Independence Rule (FAIR) introduced by Fan and Fan (2008) and can be viewed as a natural generalization of FAIR to the case of L > 2 classes. The performance of the methodology is studied via simulations and on a biological dataset of animal communication signals comprising 43 groups of electric signals recorded from tropical South American electric knife fishes.
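The following is a minimal illustrative sketch, not the paper's Bayesian models themselves: it implements the two classical screening rules that the abstract says arise as particular cases, namely variance-based selection (discard near-constant components, as in Model 1) and one-way ANOVA F-statistic selection (keep components with large between-group relative to within-group variation, as in Model 2). In the two-class case the F statistic is the square of the two-sample t statistic, which is the quantity FAIR thresholds. The function names, the choice of keeping a fixed number k of components, and the toy data are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import f_oneway


def variance_screen(X, k):
    """Keep the k components with the largest sample variance (Model-1-style rule)."""
    variances = X.var(axis=0, ddof=1)
    return np.argsort(variances)[::-1][:k]


def anova_screen(X, y, k):
    """Keep the k components whose between-group variation, measured by the
    one-way ANOVA F statistic, is largest relative to their within-group
    variation (Model-2-style rule)."""
    groups = np.unique(y)
    F = np.array([
        f_oneway(*[X[y == g, j] for g in groups]).statistic
        for j in range(X.shape[1])
    ])
    return np.argsort(F)[::-1][:k]


# Toy usage: n = 60 samples, p = 500 components, L = 3 classes,
# with only the first 10 components carrying class information.
rng = np.random.default_rng(0)
n, p, L = 60, 500, 3
y = rng.integers(0, L, size=n)
X = rng.normal(size=(n, p))
X[:, :10] += y[:, None]  # shift the informative components by class label

print(sorted(anova_screen(X, y, k=10)))  # mostly recovers indices 0..9
```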
