Abstract
Gene expression profiling is a useful technique for analyzing cellular function, and gene expression profiles are widely studied in human cancer research. Gene expression data usually consist of a very large number of features and a relatively small number of samples, and extracting a small number of important features from these data is a major challenge of gene expression-based analysis in cancer research. In this paper, we propose an embedded feature selection algorithm built on boosted linear regression. The boosting technique is applied to derive an ensemble feature selector and to improve the performance of linear regression-based feature selection. The proposed algorithm, called boosted regression-based feature selection for the multilayer perceptron (BREG-MLP), repeats the boosted feature selection process to extract the smallest feature subset that still maintains good classification performance. We apply BREG-MLP to several human cancer-related gene expression data sets to extract important features, and we confirm that it offers improved performance compared to single regression-based feature selection methods.
Highlights
Gene expression profiling is useful for understanding cellular function by visualizing the expression patterns of thousands of genes at the transcription level at specific times
We propose an embedded feature selection algorithm based on linear regression and a neural network, called boosted regression-based feature selection for the multilayer perceptron (BREG-MLP)
A linear regression method called the least absolute shrinkage and selection operator (LASSO) and an MLP configuration were integrated for embedded feature selection, and the feature selection procedure was performed repeatedly to improve its performance [12]
Summary
Gene expression profiling is useful for understanding cellular function by visualizing the expression (activity) patterns of thousands of genes at the transcription level at specific times. A method for identifying cancer types based on gene expression data was proposed in [4]. In gene expression profile analysis, dimension reduction makes it difficult to identify the genes that are important in terms of the biological pathways associated with cancer. For this reason, feature selection methods that find cancer-related genes without changing the raw expression data have been widely studied [6]. Cho: Cancer-Related Gene Signature Selection Based on Boosted Regression for MLP. We propose an embedded feature selection algorithm based on linear regression and a neural network, called boosted regression-based feature selection for the multilayer perceptron (BREG-MLP). Linear regression analysis is applied to extract important features without changing the raw data. The proposed BREG-MLP is applied to six different human cancer-related gene expression profiles to extract gene signatures.
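To make the idea of a boosted, LASSO-based ensemble selector more concrete, the following is a minimal sketch and not the authors' implementation. This section does not specify the boosting scheme, so the sketch assumes an AdaBoost-style reweighting of samples. It also assumes class labels coded as +1 and -1, a fixed penalty `lam`, and a vote-share cutoff `min_share`. All function and parameter names are hypothetical. The MLP evaluation step and the repeated shrinking of the feature subset are omitted.

```python
import math
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=50):
    """Minimise (1/2n)||y - b0 - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.

    Returns (coefficients, intercept).
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred features
    resid = [v - y_mean for v in y]  # residual when all coefficients are zero
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return beta, intercept


def boosted_lasso_select(X, y, lam=0.2, rounds=5, min_share=0.5, seed=0):
    """Assumed AdaBoost-style ensemble of LASSO selectors (illustrative only)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    weights = [1.0 / n] * n
    votes = [0.0] * p
    total_alpha = 0.0
    for _ in range(rounds):
        # Resample according to the current sample weights and fit LASSO.
        idx = rng.choices(range(n), weights=weights, k=n)
        beta, b0 = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        # A sample counts as wrong when the sign of the fit disagrees with its label.
        wrong = [(b0 + sum(b * x for b, x in zip(beta, X[i]))) * y[i] <= 0
                 for i in range(n)]
        err = sum(w for w, bad in zip(weights, wrong) if bad)
        err = min(max(err, 1e-6), 0.5 - 1e-6)  # clamp so alpha stays finite and positive
        alpha = 0.5 * math.log((1.0 - err) / err)
        # Each selector votes for its nonzero coefficients, weighted by alpha.
        for j in range(p):
            if beta[j] != 0.0:
                votes[j] += alpha
        total_alpha += alpha
        weights = [w * math.exp(alpha if bad else -alpha)
                   for w, bad in zip(weights, wrong)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return [j for j in range(p) if votes[j] / total_alpha >= min_share]


# Synthetic stand-in for expression data: 40 samples, 20 features,
# where only feature 0 is constructed to track the class label.
data_rng = random.Random(1)
y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = [[2.0 * label + data_rng.gauss(0, 0.3)]
     + [data_rng.gauss(0, 1) for _ in range(19)] for label in y]
selected = boosted_lasso_select(X, y)
print(selected)
```

In a fuller pipeline, the selected indices would be used to train an MLP classifier, and the selection would be repeated on the reduced feature set for as long as classification performance remains acceptable, in line with the description above.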