A longitudinal feature selection method identifies relevant genes to distinguish complicated injury and uncomplicated injury over time

Suyan Tian,Chi Wang,Howard H Chang

doi:10.1186/s12911-018-0685-8

Abstract

Background: Feature selection and gene set analysis are of increasing interest in the field of bioinformatics. While these two approaches have been developed for different purposes, we describe how some gene set analysis methods can be utilized to conduct feature selection.

Methods: We adopted a gene set analysis method, the significance analysis of microarray gene set reduction (SAMGSR) algorithm, to carry out feature selection for longitudinal gene expression data.

Results: Using a real-world application and simulated data, it is demonstrated that the proposed SAMGSR extension outperforms other relevant methods. In this study, we illustrate that a gene’s expression profiles over time can be regarded as a gene set and then a suitable gene set analysis method can be utilized directly to select relevant genes associated with the phenotype of interest over time.

Conclusions: We believe this work will motivate more research to bridge feature selection and gene set analysis, with the development of novel algorithms capable of carrying out feature selection for longitudinal gene expression data.

Highlights

Feature selection and gene set analysis are of increasing interest in the field of bioinformatics
In terms of computing time, a single run of the simple significance analysis of microarray gene set reduction (SAMGSR) algorithm takes 4.03 min on a Mac Pro equipped with a 2.2 GHz dual-core processor and 16 GB RAM
Using a real-world application, we showed that the longitudinal SAMGSR method is superior to other relevant algorithms

Summary

Introduction

Feature selection and gene set analysis are of increasing interest in the field of bioinformatics. While pathway analysis aims to identify relevant pathways/gene sets associated with a phenotype of interest, feature selection mainly focuses on the identification of relevant individual genes. These two tools seem to be parallel to each other.

The statistical approach typically employed to analyze longitudinal omics data is to stratify the data by time point and analyze each time point separately. This naïve strategy is inefficient because it ignores the strong dependence among measurements taken on the same subject over time, leading to a huge loss of statistical power and a failure to detect incremental changes in gene expression patterns over time [6,7,8]. Separate applications of cross-sectional feature selection methods (designed for gene expression values measured at a single time point) are therefore ineffective for longitudinal microarray data [8].
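The idea stated in the abstract, that a gene's expression profile over time can be treated as a gene set, can be sketched in a few lines of Python. This is an illustration only and not the authors' SAMGSR implementation: the data layout (one column per gene and time point, ordered gene by gene), the helper names `gene_time_sets` and `set_statistic`, the constant `s0`, and the simulated data are all assumptions made for the example. The score is a simplified SAM-GS-style sum of squared moderated two-group statistics, pooled over a gene's time points instead of being computed one time point at a time.

```python
import numpy as np


def gene_time_sets(n_genes, n_times):
    """Map each gene to the column indices of its time-point measurements.

    Assumes columns are ordered gene by gene: gene 0 at t0..tK, gene 1 at t0..tK, ...
    """
    return {g: [g * n_times + t for t in range(n_times)] for g in range(n_genes)}


def set_statistic(X, y, cols, s0=0.1):
    """Sum of squared moderated two-group statistics over one gene's time points.

    s0 is a small positive constant added to the standard error (an assumed value).
    """
    a, b = X[y == 0][:, cols], X[y == 1][:, cols]
    diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    d = diff / (se + s0)
    return float(np.sum(d ** 2))


# Simulated data: 40 subjects, 5 genes, 3 time points (hypothetical sizes).
rng = np.random.default_rng(0)
n_genes, n_times = 5, 3
X = rng.normal(size=(40, n_genes * n_times))
y = np.repeat([0, 1], 20)  # two phenotype groups of 20 subjects each

sets = gene_time_sets(n_genes, n_times)

# Make gene 2 differ between the groups at every time point.
X[np.ix_(y == 1, sets[2])] += 2.0

# Score every gene using all of its time points jointly, then rank.
scores = {g: set_statistic(X, y, cols) for g, cols in sets.items()}
top_gene = max(scores, key=scores.get)
print(top_gene, round(scores[top_gene], 2))
```

Because the score aggregates evidence across a gene's time points, a gene with a consistent but moderate shift at each time point can rank highly even if no single time point stands out. A real analysis would also need a significance assessment (for example by permuting the group labels) and the set reduction step, both omitted here.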

Journal: BMC Medical Informatics and Decision Making
Publication Date: Dec 1, 2018
Citations: 7
License type: open-access

Similar Papers

To select relevant features for longitudinal gene expression data by extending a pathway analysis method
Suyan Tian ... Hulin Wu
F1000Research | VOL. 7 | 17 Aug 2018

To select relevant features for longitudinal gene expression data by extending a pathway analysis method.
Suyan Tian ... Howard H Chang
F1000Research | VOL. 7 | 31 Jul 2018

Linear combination test for gene set analysis of a continuous phenotype
Irina Dinu ... Saumyadipta Pyne
BMC Bioinformatics | VOL. 14 | 01 Jul 2013

Identification of Genes Discriminating Multiple Sclerosis Patients from Controls by Adapting a Pathway Analysis Method.
Lei Zhang ... Linlin Wang
PLOS ONE | VOL. 11 | 15 Nov 2016
