An empirical evaluation of hierarchical feature selection methods for classification in bioinformatics datasets with gene ontology-based features

Cen Wan, Alex A Freitas

doi:10.1007/s10462-017-9541-y

Abstract

Hierarchical feature selection is a new research area in machine learning/data mining, which consists of performing feature selection by exploiting dependency relationships among hierarchically structured features. This paper evaluates four hierarchical feature selection methods, i.e., HIP, MR, SHSEL and GTD, used together with four types of lazy learning-based classifiers, i.e., Naive Bayes, Tree Augmented Naive Bayes, Bayesian Network Augmented Naive Bayes and k-Nearest Neighbors classifiers. These four hierarchical feature selection methods are compared with each other and with a well-known “flat” feature selection method, i.e., Correlation-based Feature Selection. The adopted bioinformatics datasets consist of aging-related genes used as instances and Gene Ontology terms used as hierarchical features. The experimental results reveal that the HIP (Select Hierarchical Information Preserving Features) method performs best overall, in terms of predictive accuracy and robustness when coping with data where the instances’ classes have a substantially imbalanced distribution. This paper also reports a list of the Gene Ontology terms that were most often selected by the HIP method.
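The general idea of exploiting dependency relationships among hierarchically structured features can be illustrated with a minimal sketch. It assumes binary features arranged in an is-a hierarchy, where a value of 1 for a term implies 1 for all of its ancestors and a value of 0 implies 0 for all of its descendants, so those implied features are redundant for a given instance. The toy hierarchy, the term names, and the function names below are hypothetical. The sketch is not claimed to reproduce HIP, MR, SHSEL, or GTD as evaluated in the paper.

```python
from typing import Dict, Set

# Hypothetical toy hierarchy: each term maps to its direct parents (is-a DAG).
PARENTS: Dict[str, Set[str]] = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
    "E": {"C"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every term reachable by following parent links from `term`."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def descendants(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every term that has `term` among its ancestors."""
    return {t for t in parents if term in ancestors(t, parents)}


def select_non_redundant(instance: Dict[str, int],
                         parents: Dict[str, Set[str]]) -> Set[str]:
    """Drop features whose value is implied by another feature of the instance.

    Assumption: a 1 makes all ancestors redundant (they must also be 1),
    and a 0 makes all descendants redundant (they must also be 0).
    """
    redundant: Set[str] = set()
    for term, value in instance.items():
        if value == 1:
            redundant |= ancestors(term, parents)
        else:
            redundant |= descendants(term, parents)
    return {t for t in instance if t not in redundant}


# One instance (e.g. one gene) with hierarchy-consistent binary annotations.
instance = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0}
selected = select_non_redundant(instance, PARENTS)
print(sorted(selected))
```

In this toy case the selection keeps the most specific positive term and the most general negative term, since every other value follows from those two. The selection is made per instance, which fits the lazy learning setting mentioned in the abstract, where the model is built only when a test instance is classified.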

Journal: Artificial Intelligence Review
Publication Date: Jan 30, 2017
Citations: 24

Similar Papers

Two methods for constructing a gene ontology-based feature network for a Bayesian network classifier and applications to datasets of aging-related genes
Cen Wan ... Alex A Freitas
09 Sep 2015

Harvestman: a framework for hierarchical feature learning and selection from whole genome sequencing data
Trevor S Frisby ... Quang Minh Hoang
BMC Bioinformatics | VOL. 22
01 Apr 2021

Robust hierarchical feature selection with a capped [formula omitted]-norm
Xinxin Liu ... Hong Zhao
Neurocomputing | VOL. 443
10 Mar 2021

Feature selection via maximizing inter-class independence and minimizing intra-class redundancy for hierarchical classification
Jie Shi ... Hong Zhao
Information Sciences | VOL. 626
10 Jan 2023