Abstract

With recent advances in genotyping and sequencing technologies, many disease susceptibility loci have been identified. However, much of the genetic heritability remains unexplained, and the replication rate between independent studies is still low. Meanwhile, there have been increasing efforts to functionally annotate the entire human genome, such as the Encyclopedia of DNA Elements (ENCODE) project and other similar projects. It has been shown that incorporating these functional annotations to prioritize genome-wide association signals may help identify true association signals. However, to our knowledge, the extent of the improvement when functional annotation data are considered has not been studied in the literature. In this article, we propose a statistical framework to estimate the improvement in replication rate with annotation data, and apply it to Crohn's disease and DNase I hypersensitive sites. The results show that with cell-line-specific functional annotations, the expected replication rate is improved, but only at a modest level.

Highlights

  • With recent advances in genotyping and sequencing technologies, many disease susceptibility loci have been identified

  • There is a need for effective computational approaches to prioritizing genome-wide association study (GWAS) results using functional annotations, because 44% of the trait/disease susceptibility loci documented in the NHGRI GWAS catalogue as of 12/03/13 [1] are located in intergenic regions, which could be overlooked by gene-centric methods, and much can be learned about the functional roles of these loci from the rapidly increasing functional annotation data for the non-coding regions

  • First, among the many types of functional assays in various cell lines, which are more informative for the trait/disease of interest? In other words, how should appropriate annotation data be selected to prioritize single nucleotide polymorphisms (SNPs) and increase the replication rate in follow-up studies? Second, how much improvement, in terms of replication rate, can we expect by incorporating such information? The answers to these questions will depend on the specific diseases to be studied and the available functional annotation data


Summary

Introduction

With recent advances in genotyping and sequencing technologies, many disease susceptibility loci have been identified. The results show that with cell-line-specific functional annotations, the expected replication rate is improved, but only at a modest level. It is critical to develop statistical methods to prioritize regions with similar association evidence to improve the replication rate in follow-up studies so as to achieve genome-level statistical significance. The epigenome mapping by the Encyclopedia of DNA Elements (ENCODE) project [7] and the Roadmap Epigenomics Program [8] has generated large amounts of experimental data in a variety of human cell lines and tissues. Researchers hope that these datasets will help to decipher the functional relevance of non-coding SNPs and disease etiology. We show that when appropriate functional annotation data are incorporated, the replication rate for the prioritized SNPs may be improved. Other information and approaches are needed to better prioritize SNPs with a significantly improved replication rate.
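
To make the idea of an "expected replication rate" concrete, the following is a minimal toy sketch and not the statistical framework proposed in the article. It assumes a simple two-group model in which each SNP is either truly associated or null, and it applies Bayes' rule to show how restricting attention to annotated SNPs could shift the expected replication rate. All function names and parameters (`pi`, `q1`, `q0`, the power and significance levels) are hypothetical and chosen for illustration only.

```python
def prior_given_annotation(pi, q1, q0):
    """Probability that a SNP is truly associated, given that it is annotated.

    pi: assumed prior fraction of truly associated SNPs (hypothetical)
    q1: assumed P(annotated | truly associated)
    q0: assumed P(annotated | null)
    """
    return pi * q1 / (pi * q1 + (1.0 - pi) * q0)


def expected_replication_rate(pi, power_disc, alpha_disc, power_rep, alpha_rep):
    """Expected fraction of discovery-stage hits that also pass replication.

    A truly associated SNP passes each stage with that stage's power;
    a null SNP passes each stage with that stage's significance level.
    """
    true_selected = pi * power_disc          # true signals passing discovery
    null_selected = (1.0 - pi) * alpha_disc  # false positives passing discovery
    replicated = true_selected * power_rep + null_selected * alpha_rep
    return replicated / (true_selected + null_selected)


if __name__ == "__main__":
    # Illustrative values only; none of these numbers come from the article.
    pi, q1, q0 = 0.001, 0.4, 0.2
    settings = dict(power_disc=0.5, alpha_disc=1e-4, power_rep=0.8, alpha_rep=0.05)

    baseline = expected_replication_rate(pi, **settings)
    pi_annotated = prior_given_annotation(pi, q1, q0)
    annotated = expected_replication_rate(pi_annotated, **settings)

    print(f"all SNPs:       {baseline:.3f}")
    print(f"annotated SNPs: {annotated:.3f}")
```

In this toy model, the gain from annotation depends entirely on the enrichment ratio `q1 / q0`: when annotated SNPs are no more likely to be truly associated than unannotated ones, the two rates coincide.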
