Abstract
Background: The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise. Most statistical methods used in the literature do not fully exploit the temporal ordering in the dataset and are not suited to the case where temporal profiles are measured for a number of different biological conditions. We present a statistical test that makes explicit use of the temporal order in the data by fitting polynomial functions to the temporal profile of each gene and for each biological condition. A Hotelling T2-statistic is derived to detect the genes for which the parameters of these polynomials are significantly different from each other.
Results: We validate the temporal Hotelling T2-test on muscular gene expression data from four mouse strains which were profiled at different ages: dystrophin-, beta-sarcoglycan and gamma-sarcoglycan deficient mice, and wild-type mice. The first three are animal models for different muscular dystrophies. Extensive biological validation shows that the method is capable of finding genes with temporal profiles significantly different across the four strains, as well as identifying potential biomarkers for each form of the disease. The added value of the temporal test compared to an identical test which does not make use of temporal ordering is demonstrated via a simulation study, and through confirmation of the expression profiles from selected genes by quantitative PCR experiments. The proposed method maximises the detection of the biologically interesting genes, whilst minimising false detections.
Conclusion: The temporal Hotelling T2-test is capable of finding relatively small and robust sets of genes that display different temporal profiles between the conditions of interest. The test is simple, it can be used on gene expression data generated from any experimental design and for any number of conditions, and it allows fast interpretation of the temporal behaviour of genes. The R code is available from V.V. The microarray data have been submitted to GEO under series GSE1574 and GSE3523.
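The idea of fitting a polynomial to each temporal profile and then comparing the fitted parameters can be illustrated with a small sketch. This is not the authors' R code and not their derivation of the statistic: it is a simplified two-condition, Hotelling-type quadratic form on the difference of least-squares coefficients. It assumes a polynomial of degree 2, the same time points in both conditions, and independent errors with constant variance. All function names (`fit_polynomial`, `coefficient_t2`, and the matrix helpers) and the example data are hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_polynomial(times, values, degree):
    """Least-squares polynomial fit; returns coefficients and their covariance."""
    dof = len(times) - (degree + 1)
    if dof <= 0:
        raise ValueError("need more observations than polynomial coefficients")
    X = [[t ** p for p in range(degree + 1)] for t in times]
    Xt = transpose(X)
    XtX_inv = inverse(matmul(Xt, X))
    beta = [row[0] for row in matmul(XtX_inv, matmul(Xt, [[v] for v in values]))]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    rss = sum((v - f) ** 2 for v, f in zip(values, fitted))
    sigma2 = rss / dof  # residual variance estimate (assumed constant over time)
    cov = [[sigma2 * v for v in row] for row in XtX_inv]
    return beta, cov


def coefficient_t2(times, profile_a, profile_b, degree=2):
    """Quadratic form d' S^-1 d on the coefficient difference between two conditions."""
    beta_a, cov_a = fit_polynomial(times, profile_a, degree)
    beta_b, cov_b = fit_polynomial(times, profile_b, degree)
    diff = [a - b for a, b in zip(beta_a, beta_b)]
    # Covariance of the difference, assuming the two fits are independent.
    cov_diff = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(cov_a, cov_b)]
    s_inv = inverse(cov_diff)
    k = len(diff)
    return sum(diff[i] * s_inv[i][j] * diff[j] for i in range(k) for j in range(k))


# Made-up expression values for one gene at six time points.
times = [1, 2, 3, 4, 5, 6]
condition_a = [1.0, 2.1, 2.9, 4.2, 4.8, 6.1]       # roughly linear
condition_b = [1.1, 1.9, 3.0, 3.9, 5.2, 5.9]       # similar trend to condition_a
condition_c = [1.0, 3.9, 9.2, 15.8, 25.1, 36.2]    # clearly different curvature

print("a vs b:", coefficient_t2(times, condition_a, condition_b))
print("a vs c:", coefficient_t2(times, condition_a, condition_c))
```

A larger value indicates that the fitted curves differ more relative to the residual noise. Turning the value into a p-value, handling more than two conditions, and correcting for multiple testing across genes are left out of this sketch.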
Highlights
The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise.
In a typical time course microarray study, a number of microarray experiments are carried out at biologically interesting time points and across different biological conditions. It is a frequent and challenging goal to try to identify which of these genes exhibit an interesting temporal behaviour, for example whether and when a gene becomes up- or down-regulated and, more importantly, whether its behaviour is significantly different across the biological conditions of interest.
Because many genes had expression levels that gradually decreased or increased over time, and a distinct subset of genes had expression that peaked at the age of 8 weeks, we propose in this paper a statistical test that has more power to detect these patterns of expression.
Summary
The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise.
Introduction: In a typical time course microarray study, a number of microarray experiments are carried out at biologically interesting time points and across different biological conditions. Various methods have been proposed in the literature to detect differentially expressed genes from time course microarray experiments. Most of these methods aim at detecting genes whose temporal profile is significantly different from a control condition in which there is no change in expression. Amongst the well-known shortcomings of clustering techniques, many methods, like the commonly used hierarchical clustering and k-means, do not make actual use of the temporal order in the data. To address this problem, [2] propose a model-based clustering method for time course data, where each cluster is generated by a vector autoregressive time series model. Other model-based techniques to detect differentially expressed genes from time course microarray data include the use of linear spline functions for single gene profiles by [3] and more specific periodic functions to detect periodically expressed genes by [4] and [5].