Ensemble feature selection for biomarker discovery in mass spectrometry-based metabolomics

Aliasghar Shahrjooihaghighi, Xiaoli Wei, Xiang Zhang, Hichem Frigui, Craig J. McClain, Biyun Shi

doi:10.1145/3297280.3297283

Abstract

Biomarker discovery, i.e., identifying the discriminative features responsible for the alteration of a biological system, is often addressed by feature selection using machine learning. Although many individual feature selection methods are used in biomarker discovery, the nature of omics data (few samples, many features, and noisy measurements) makes each of these algorithms unstable. In this paper, we investigate various ensemble feature selection methods that improve the reliability of molecular biomarker selection by combining the complementary information of multiple feature selection methods. We compare the different ensemble approaches on a metabolomics dataset containing three sample groups. Our results indicate that the ensemble approach outperforms the individual feature selection algorithms and provides more stable results.
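To make the idea of combining several feature selection methods concrete, the following is a minimal sketch of one common aggregation scheme, mean-rank aggregation. It is an illustration only: the abstract does not state which aggregation rules were studied, so the scheme, the function names (`ranks_from_scores`, `ensemble_select`), the metabolite labels, and the scores are all hypothetical assumptions.

```python
from statistics import mean


def ranks_from_scores(scores):
    """Convert higher-is-better scores into ranks (1 = most discriminative)."""
    # Ties are broken alphabetically so the ranking is deterministic.
    ordered = sorted(scores, key=lambda feature: (-scores[feature], feature))
    return {feature: position for position, feature in enumerate(ordered, start=1)}


def ensemble_select(score_tables, k):
    """Aggregate several selectors by mean rank and return the top-k features.

    Assumes every selector scored the same set of features.
    """
    rank_tables = [ranks_from_scores(table) for table in score_tables]
    features = sorted(rank_tables[0])
    mean_rank = {f: mean(ranks[f] for ranks in rank_tables) for f in features}
    return sorted(features, key=lambda f: (mean_rank[f], f))[:k]


# Hypothetical importance scores from three individual selectors
# for four hypothetical metabolites (not data from the paper).
selector_scores = [
    {"m1": 0.9, "m2": 0.2, "m3": 0.7, "m4": 0.1},
    {"m1": 0.6, "m2": 0.1, "m3": 0.8, "m4": 0.3},
    {"m1": 0.5, "m2": 0.4, "m3": 0.9, "m4": 0.2},
]

selected = ensemble_select(selector_scores, k=2)
print(selected)
```

In this toy example, a feature that only one selector ranks first can still be overtaken by a feature that all selectors rank consistently high, which is the intuition behind using an ensemble to reduce the instability of any single method.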

Full Text