Abstract
Background: Gene clustering of periodic transcriptional profiles provides an opportunity to shed light on a variety of biological processes, but this technique relies critically upon robust modeling of the longitudinal covariance structure over time.

Methodology: We propose a statistical method for functional clustering of periodic gene expression by modeling the covariance matrix of serial measurements through a general autoregressive moving-average process of order (p, q), the so-called ARMA(p, q). We derive a sophisticated EM algorithm to estimate, within a mixture model framework, the proportions of each gene cluster, the Fourier series parameters that define gene-specific differences in periodic expression trajectories, and the ARMA parameters that model the covariance structure. The orders p and q of the ARMA process that provide the best fit are identified by model selection criteria.

Conclusions: Using simulated data, we show that when the data call for it, a sophisticated covariance structure such as ARMA is crucial for obtaining unbiased estimates of the mean structure parameters and for increasing the precision of estimation. The methods were applied to recently published time-course gene expression data in yeast, and the procedure effectively identified interesting periodic clusters in the dataset. The new approach will provide a powerful tool for understanding biological functions on a genomic scale.
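To make the covariance modeling concrete, the following is a minimal sketch of how an ARMA covariance matrix for serial measurements could be built, using the simplest mixed case, ARMA(1,1). It assumes equally spaced time points and a stationary process; the function name `arma11_covariance` and the parameter names `phi`, `theta`, and `sigma2` are illustrative choices, not notation taken from the paper, and the sketch does not reproduce the authors' EM algorithm.

```python
import numpy as np


def arma11_covariance(phi, theta, sigma2, n_times):
    """Covariance matrix of a stationary ARMA(1,1) process.

    Assumed model (hypothetical notation, not the paper's):
        x_t = phi * x_{t-1} + e_t + theta * e_{t-1},  Var(e_t) = sigma2,
    observed at n_times equally spaced time points.
    """
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")

    # Autocovariances gamma(0), gamma(1), ..., gamma(n_times - 1).
    gamma = np.empty(n_times)
    gamma[0] = sigma2 * (1 + 2 * phi * theta + theta**2) / (1 - phi**2)
    if n_times > 1:
        gamma[1] = sigma2 * (1 + phi * theta) * (phi + theta) / (1 - phi**2)
    for k in range(2, n_times):
        # Beyond lag 1 the autocovariance decays geometrically.
        gamma[k] = phi * gamma[k - 1]

    # Entry (i, j) of the matrix depends only on the lag |i - j|.
    idx = np.arange(n_times)
    lags = np.abs(np.subtract.outer(idx, idx))
    return gamma[lags]


if __name__ == "__main__":
    print(arma11_covariance(phi=0.5, theta=0.3, sigma2=1.0, n_times=4))
```

Setting `theta=0` reduces this to the AR(1) structure, and setting `phi=0` gives an MA(1) structure, which is one way to see why the general ARMA family can cover several simpler covariance models.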
Highlights
DNA microarray technologies are widely used to detect and understand genome-wide gene expression regulation and function
Through simulated data we show that whenever it is necessary, employment of sophisticated covariance structures such as ARMA is crucial in order to obtain unbiased estimates of the mean structure parameters and increased precision of estimation
Simulation Results: The performance of the proposed mixture model, in terms of the precision and efficiency of the parameter estimates and the model selection for the number of components, has been extensively studied in [20], where the AR(1) covariance structure was considered for Si
Summary
DNA microarray technologies are widely used to detect and understand genome-wide gene expression regulation and function. Functional principal component analysis and mixture models have become popular dimension reduction tools in microarray studies to cluster genes with similar temporal patterns [6,7,8,9,10,11,12]. These methods model the time-dependent gene expression profiles using nonparametric approaches. The proposed model instead represents the expression profiles with a Fourier series, which can be considerably more powerful in the presence of truly periodic signals while remaining robust to non-periodic signals. This is illustrated in the real data analysis in the Real Data Application section below. Gene clustering of periodic transcriptional profiles provides an opportunity to shed light on a variety of biological processes, but this technique relies critically upon robust modeling of the longitudinal covariance structure over time.
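As an illustration of representing a periodic expression profile with a Fourier series, the sketch below fits a truncated series to a single time course by ordinary least squares. This is a simplification under stated assumptions: the period is taken as known, the errors are treated as independent (so the ARMA covariance structure is ignored), and the names `fourier_design` and `fit_fourier` are hypothetical helpers, not part of the published method.

```python
import numpy as np


def fourier_design(t, period, n_harmonics):
    """Design matrix with columns [1, cos(w1 t), sin(w1 t), cos(w2 t), ...]."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2 * np.pi * k / period  # angular frequency of the k-th harmonic
        cols.append(np.cos(w * t))
        cols.append(np.sin(w * t))
    return np.column_stack(cols)


def fit_fourier(t, y, period, n_harmonics):
    """Least-squares Fourier coefficients and fitted curve for one profile.

    Assumes independent errors; a covariance-aware fit would use
    generalized least squares instead.
    """
    X = fourier_design(t, period, n_harmonics)
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef, X @ coef


if __name__ == "__main__":
    t = np.arange(12.0)
    y = 1.0 + 2.0 * np.cos(2 * np.pi * t / 6) - 0.5 * np.sin(2 * np.pi * t / 6)
    coef, fitted = fit_fourier(t, y, period=6.0, n_harmonics=1)
    print(coef)
```

The coefficient vector is ordered as intercept, then the cosine and sine coefficients of each harmonic in turn. In a mixture setting, each cluster would carry its own coefficient vector of this form.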