Abstract

Background: Burgeoning interest in integrative analyses has produced a rise in studies which incorporate data from multiple genomic platforms. The literature on formal hypothesis testing at an integrative gene set level is sparse. This paper is biologically motivated by our interest in the joint effects of epigenetic methylation loci and their associated mRNA gene expressions on lung cancer survival status.

Results: We provide an efficient screening approach across multiplatform genomic data on the level of biologically related sets of genes, and our methods are applicable to various disease models regardless of whether the underlying true model is known (iTEGS) or unknown (iNOTE). Our proposed testing procedure dominated two competing methods. Using our methods, we identified a total of 28 gene sets with significant joint epigenomic and transcriptomic effects on one-year lung cancer survival.

Conclusions: We propose efficient variance component-based testing procedures to facilitate the joint testing of multiplatform genomic data across an entire gene set. The testing procedure for the gene set is self-contained, and can easily be extended to include more or different genetic platforms. iTEGS and iNOTE, implemented in R, are freely available through the inote package at https://cran.r-project.org//.

Highlights

  • We demonstrate the utility of our integrative testing procedures by identifying significant gene sets that can be further explored for potential biomarkers of prognosis or even therapeutic targets

Introduction

Burgeoning interest in integrative analyses has produced a rise in studies which incorporate data from multiple genomic platforms. Such integration generally takes one of two forms. The first is horizontal integration, where genomic data from different studies but of the same type (e.g. multiple gene-expression microarray studies) are combined, sometimes across labs, cohorts, and platforms. The second is vertical integration, where multiple levels of ’omics data (e.g. DNA variation, methylation, and gene expression) are gathered on the same subjects and analyzed together. Most integrative studies rely primarily on dimension reduction methods to accommodate the high dimensionality of analyzing multiple platforms [4, 5]. These techniques synthesize complex genetic information into summary statistics, potentially at the cost of discarding large quantities of data which might still be mechanistically informative. While methods development for non-reductive multiplatform integrative analysis has become more common in recent years [6, 7], these methods are mainly restricted to candidate gene interrogations, and do not encapsulate the highly likely network-level interactions between
