Abstract
Robust methods for identifying patterns of expression in genome-wide data are important for generating hypotheses regarding gene function. To this end, several analytic methods have been developed for detecting periodic patterns. We improve one such method, JTK_CYCLE, by explicitly calculating the null distribution such that it accounts for multiple hypothesis testing and by including non-sinusoidal reference waveforms. We term this method empirical JTK_CYCLE with asymmetry search, and we compare its performance to JTK_CYCLE with Bonferroni and Benjamini-Hochberg multiple hypothesis testing correction, as well as to five other methods: cyclohedron test, address reduction, stable persistence, ANOVA, and F24. We find that ANOVA, F24, and JTK_CYCLE consistently outperform the other three methods when data are limited and noisy; empirical JTK_CYCLE with asymmetry search gives the greatest sensitivity while controlling for the false discovery rate. Our analysis also provides insight into experimental design and we find that, for a fixed number of samples, better sensitivity and specificity are achieved with higher numbers of replicates than with higher sampling density. Application of the methods to detecting circadian rhythms in a metadataset of microarrays that quantify time-dependent gene expression in whole heads of Drosophila melanogaster reveals annotations that are enriched among genes with highly asymmetric waveforms. These include a wide range of oxidation reduction and metabolic genes, as well as genes with transcripts that have multiple splice forms.
Highlights
Rhythmic behavior is ubiquitous across the spectrum of life [1,2,3,4]
We compare all the precision-recall curves for all the methods on these data via the area under the receiver operating characteristic (AUROC), a measure of the sensitivity and specificity of the rhythm detection methods that does not depend on the proportions of positives and negatives in the dataset
We use it to further assess the importance of considering asymmetric waveforms, and we explore how multiple hypothesis correction impacts the results when the true positives represent a relatively small fraction of the simulated time series, as we expect to be the case in genome-wide studies
Summary
Rhythmic behavior is ubiquitous across the spectrum of life [1,2,3,4]. Diverse fundamental biological functions such as cell division, energy metabolism, and sleep are periodic, and a growing body of evidence implicates temporal dysregulation as a contributing factor to depression, neurodegeneration, cardiovascular disease, and metabolic disorders in higher organisms [5,6,7,8,9]. The most well-studied periodic patterns are circadian rhythms: oscillatory changes in gene expression, metabolism, physiology, and behavior with approximately 24-hour (24 h) periods that enable organisms to anticipate and respond to daily changes in their environment, such as nutrient accessibility, temperature, and light [10,11,12,13]. The components of the core clock are well characterized and are strongly conserved across a wide range of species [14, 15]. It remains to be determined how this clock couples to other molecular processes. Previous work suggests that hundreds, possibly thousands, of genes are regulated by circadian clocks [10, 15, 18]
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.