Abstract

Motivation: Gene-expression data obtained from high-throughput technologies are subject to various sources of noise, and accordingly the raw data are pre-processed before being formally analyzed. Normalization of the data is a key pre-processing step, since it removes systematic variations across arrays. There are numerous normalization methods available in the literature. Based on our experience, in the context of oscillatory systems, such as the cell cycle, the circadian clock, etc., the choice of the normalization method may substantially impact whether a gene is determined to be rhythmic. Thus the rhythmicity of a gene can be purely an artifact of how the data were normalized. Since the determination of rhythmic genes is an important component of modern toxicological and pharmacological studies, it is important to identify truly rhythmic genes that are robust to the choice of a normalization method.

Results: In this paper we introduce a rhythmicity measure and a bootstrap methodology to detect rhythmic genes in an oscillatory system. Although the proposed methodology can be used for any high-throughput gene-expression data, in this paper we illustrate it using several publicly available circadian clock microarray gene-expression datasets. We demonstrate that the choice of normalization method has very little effect on the proposed methodology. Specifically, for any pair of normalization methods considered in this paper, the resulting values of the rhythmicity measure are highly correlated. This suggests that the proposed measure is robust to the choice of a normalization method. Consequently, the rhythmicity of a gene is potentially not a mere artifact of the normalization method used. Lastly, as demonstrated in the paper, the proposed bootstrap methodology can also be used for simulating data for genes participating in an oscillatory system using a reference dataset.

Availability: User-friendly code implemented in the R language can be downloaded from http://www.eio.uva.es/~miguel/robustdetectionprocedure.html

Highlights

  • One of the major difficulties in dealing with high-throughput gene-expression experiments is the noisy nature of the data (Tu et al, 2002; Klebanov and Yakovlev, 2007) that is intrinsic to each array

  • We demonstrate that the choice of normalization method has very little effect on the proposed methodology

  • For any pair of normalization methods considered in this paper, the resulting values of the rhythmicity measure are highly correlated
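
The pairwise comparison mentioned in the last highlight can be illustrated with a minimal sketch. It assumes that a rhythmicity value has already been computed for every gene under each normalization method; the method labels, the numbers, and the function names below are hypothetical placeholders, the rhythmicity measure itself is not reproduced here, and Pearson correlation is used only as one plausible choice of correlation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(scores_by_method):
    """Correlate per-gene scores for every unordered pair of methods.

    scores_by_method maps a method label to a list of per-gene values,
    with genes in the same order for every method.
    """
    return {
        (m1, m2): pearson(scores_by_method[m1], scores_by_method[m2])
        for m1, m2 in combinations(sorted(scores_by_method), 2)
    }


# Made-up values for five genes under three hypothetical method labels.
example_scores = {
    "method_A": [0.91, 0.15, 0.62, 0.40, 0.78],
    "method_B": [0.88, 0.20, 0.58, 0.45, 0.80],
    "method_C": [0.93, 0.10, 0.65, 0.38, 0.75],
}

for pair, r in pairwise_correlations(example_scores).items():
    print(pair, round(r, 3))
```

Correlations close to 1 for every pair would be consistent with a measure that is insensitive to the normalization step; values well below 1 for some pair would point to method-dependent rhythmicity calls.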


Summary

Introduction

One of the major difficulties in dealing with high-throughput gene-expression experiments is the noisy nature of the data (Tu et al, 2002; Klebanov and Yakovlev, 2007) that is intrinsic to each array. A variety of pre-processing methods are available in the literature, such as the Model-based Expression Index (MBEI) (Li and Wong, 2001), MAS 5.0 (Hubbell et al, 2002; Liu et al, 2003), and Robust Multi-array Average (RMA) (Irizarry et al, 2003b). They usually involve three distinct steps, namely, Background correction, Normalization, and Summarization (Wu, 2009). The resulting normalized expression data, and the downstream analyses, are expected to depend upon the normalization method used.

It is well known that many biological processes, such as the metabolic cycle (Slavov et al, 2012), the cell cycle (Rustici et al, 2004; Oliva et al, 2005; Peng et al, 2005; Barragán et al, 2015) or the circadian clock (Hughes et al, 2009), are governed by oscillatory systems consisting of numerous components that exhibit rhythmic or periodic patterns over time. One may refer to Caretta-Cartozo et al (2007)
