Abstract

Continuing efforts from large international consortia have made genome-wide epigenomic and transcriptomic annotation data publicly available for a variety of cell and tissue types. However, synthesizing these datasets into effective summary metrics that characterize the functional non-coding genome remains a challenge. Here, we present GenoSkyline-Plus, an extension of our previous work that integrates an expanded set of epigenomic and transcriptomic annotations to produce high-resolution, single-tissue annotations. After validating our annotations against a catalog of tissue-specific non-coding elements previously identified in the literature, we apply our method to data from 127 different cell and tissue types and present an atlas of heritability enrichment across 45 different GWAS traits. We show that broader organ system categories (e.g., immune system) increase statistical power for identifying biologically relevant tissue types for complex diseases, while annotations of individual cell types (e.g., monocytes or B-cells) provide deeper insights into disease etiology. Additionally, we use our GenoSkyline-Plus annotations in an in-depth case study of late-onset Alzheimer’s disease (LOAD). Our analyses suggest a strong connection between LOAD heritability and genetic variants located in regions of the genome that are functional in monocytes. Furthermore, we show that LOAD and Parkinson’s disease share a similar localization of SNPs to monocyte-functional regions. Overall, we demonstrate that integrated genome annotations at the single-tissue level provide a valuable tool for understanding the etiology of complex human diseases. Our GenoSkyline-Plus annotations are freely available at http://genocanyon.med.yale.edu/GenoSkyline.

Highlights

  • Large consortia such as ENCODE [1] and Epigenomics Roadmap Project [2] have generated a rich collection of high-throughput genomic and epigenomic data, providing unprecedented opportunities to delineate functional structures in the human genome

  • As complex disease research rapidly advances, increasing evidence suggests that non-coding regulatory DNA elements may be the primary regions harboring risk variants in human complex diseases

  • We introduce GenoSkyline-Plus, a principled annotation framework to identify tissue and cell type-specific functional regions in the human genome through integration of diverse high-throughput epigenomic and transcriptomic data



Introduction

Large consortia such as ENCODE [1] and the Epigenomics Roadmap Project [2] have generated a rich collection of high-throughput genomic and epigenomic data, providing unprecedented opportunities to delineate functional structures in the human genome. As complex disease research rapidly advances, evidence has emerged that disease-associated variants are enriched in regulatory DNA elements [3, 4]. Functional annotation of the non-coding genome is therefore critical for understanding the genetic basis of human complex diseases. Categorizing the complex regulatory machinery of the genome requires integrating diverse types of annotation data, because no single annotation captures all types of functional elements [5]. We previously developed GenoSkyline [6], a principled framework to identify tissue-specific functional regions in the human genome through integrative analysis of various chromatin modifications. Here, we introduce GenoSkyline-Plus, a comprehensive update of GenoSkyline that incorporates RNA sequencing and DNA methylation data into the framework and extends to 127 integrated annotation tracks covering a spectrum of human tissue and cell types.

