Abstract

Background: Recent high-throughput technologies have opened avenues for simultaneous analyses of thousands of genes. With the availability of a multitude of public databases, one can easily access multiple genomic study results, where each study comprises significance testing results for thousands of genes. Researchers currently tend to combine the genomic information from these multiple studies in the form of a meta-analysis. As the number of genes involved is very large, the classical meta-analysis approaches need to be updated to acknowledge this large-scale aspect of the data.

Methods: In this article, we discuss how applying the standard theoretical null distributional assumptions of the classical meta-analysis methods, such as Fisher's p-value combination and Stouffer's Z, can lead to incorrect significance testing results, and we propose a robust meta-analysis method that empirically modifies the individual test statistics and p-values before combining them.

Results: Our proposed meta-analysis method performs best in significance testing among several meta-analysis approaches, especially in the presence of hidden confounders, as shown through a wide variety of simulation studies and real genomic data analysis.

Conclusion: The proposed meta-analysis method produces superior meta-analysis results compared to the standard p-value combination approaches for large-scale simultaneous testing in genomic experiments. This is particularly useful in studies with a large number of genes, where the standard meta-analysis approaches can result in gross false discoveries due to the presence of unobserved confounding variables.
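For readers unfamiliar with the two classical rules named in the Methods, the following is a minimal Python sketch of Fisher's p-value combination and Stouffer's Z for a single gene measured in k studies. It is an illustration only, not the authors' code: the function names `fisher_combine` and `stouffer_combine` are hypothetical, the studies are assumed independent, the p-values are assumed one-sided and strictly between 0 and 1, and Stouffer's Z is taken with equal weights. Both rules rely on the theoretical null (uniform p-values, standard normal z-scores), which is the assumption the article questions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # theoretical null: N(0, 1)


def fisher_combine(pvalues):
    """Fisher's method: X = -2 * sum(log p_i) is chi-square with 2k df under the null."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    # Chi-square survival function in closed form for even df = 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def stouffer_combine(pvalues):
    """Stouffer's Z (equal weights): average the z-scores, rescale by sqrt(k)."""
    k = len(pvalues)
    z = sum(STD_NORMAL.inv_cdf(1.0 - p) for p in pvalues) / math.sqrt(k)
    return 1.0 - STD_NORMAL.cdf(z)


# Hypothetical p-values for one gene across three studies.
example_pvalues = [0.01, 0.02, 0.03]
print("Fisher combined p:  ", fisher_combine(example_pvalues))
print("Stouffer combined p:", stouffer_combine(example_pvalues))
```

In a genomic meta-analysis the same computation would be repeated for every gene, so any mismatch between the theoretical and the true null distribution is repeated thousands of times.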

Highlights

  • Recent high-throughput technologies have opened avenues for simultaneous analyses of thousands of genes

  • In this article, we have highlighted the drawbacks of the classical p-value combination methods for significance testing in large-scale genomic experiments

  • These classical p-value combination methods rely on a theoretical null distribution, which can differ from the true null distribution, especially in the presence of confounding variables in large observational studies
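To make the last highlight concrete, the sketch below simulates z-scores for genes that are all truly null but whose null distribution is shifted and widened, as might happen under a hidden confounder, and then re-standardizes them with an empirically estimated center and scale. This is an assumption-laden illustration, not the procedure proposed in the article: the median and MAD estimator, the helper names, and the simulated shift (0.3) and spread (1.5) are all hypothetical choices, and the estimator presumes that most genes are null.

```python
import random
import statistics
from statistics import NormalDist

STD_NORMAL = NormalDist()  # theoretical null: N(0, 1)


def empirical_null_adjust(z_scores):
    """Re-standardize z-scores with a robust (median / MAD) estimate of the null."""
    center = statistics.median(z_scores)
    mad = statistics.median([abs(z - center) for z in z_scores])
    # For a normal distribution, sd = MAD / Phi^{-1}(0.75).
    scale = mad / STD_NORMAL.inv_cdf(0.75)
    return [(z - center) / scale for z in z_scores]


def two_sided_p(z):
    """Two-sided p-value of z under the theoretical N(0, 1) null."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def rejection_rate(z_scores, alpha=0.05):
    """Fraction of genes declared significant at level alpha."""
    return sum(two_sided_p(z) < alpha for z in z_scores) / len(z_scores)


# Simulated study: 5000 null genes whose true null is N(0.3, 1.5^2), not N(0, 1).
random.seed(0)
z_raw = [random.gauss(0.3, 1.5) for _ in range(5000)]
z_adj = empirical_null_adjust(z_raw)

rate_raw = rejection_rate(z_raw)  # false positive rate under the theoretical null
rate_adj = rejection_rate(z_adj)  # false positive rate after empirical adjustment
print("raw:", rate_raw, "adjusted:", rate_adj)
```

Because every simulated gene is null, each rejection is a false discovery. The raw rate lands well above the nominal 5%, while the adjusted rate lands close to it.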



Introduction

Recent high-throughput technologies have opened avenues for simultaneous analyses of thousands of genes. With the availability of a multitude of public databases, one can access multiple genomic study results, where each study comprises significance testing results for thousands of genes. Researchers currently tend to combine the genomic information from these multiple studies in the form of a meta-analysis. In genomic experiments and association studies, meta-analysis is a popular tool for pooling results from multiple experiments and research studies to reach an overall decision. Rapid progress in technology has led to major developments in high-throughput genomic assays. The large number of datasets available in public repositories and databases has enabled researchers to assimilate large-scale genomic information from multiple studies in the form of meta-analysis [1,2,3]. Since the sample sizes of individual genomic experiments are generally small compared to the number of genes resulting in loss of power

