Abstract

Genetic heterogeneity presents a significant challenge for the identification of monogenic disease genes. Whole‐exome sequencing generates a large number of candidate disease‐causing variants and typical analyses rely on deleterious variants being observed in the same gene across several unrelated affected individuals. This is less likely to occur for genetically heterogeneous diseases, making more advanced analysis methods necessary. To address this need, we present HetRank, a flexible gene‐ranking method that incorporates interaction network data. We first show that different genes underlying the same monogenic disease are frequently connected in protein interaction networks. This motivates the central premise of HetRank: those genes carrying potentially pathogenic variants and whose network neighbors do so in other affected individuals are strong candidates for follow‐up study. By simulating 1,000 exome sequencing studies (20,000 exomes in total), we model varying degrees of genetic heterogeneity and show that HetRank consistently prioritizes more disease‐causing genes than existing analysis methods. We also demonstrate a proof‐of‐principle application of the method to prioritize genes causing Adams‐Oliver syndrome, a genetically heterogeneous rare disease. An implementation of HetRank in R is available via the website http://sourceforge.net/p/hetrank/.
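The central premise can be illustrated with a toy score. The sketch below is not the HetRank algorithm or its R implementation; it is a minimal illustration that assumes a simple counting rule: a gene is credited once for every affected individual who carries a candidate variant in that gene or in one of its direct network neighbors. The function names, the patient and gene labels, and the example network are all hypothetical.

```python
from collections import defaultdict


def neighbor_support_scores(carriers, edges):
    """Toy score (an assumption, not HetRank's actual statistic).

    carriers: dict mapping patient ID -> set of genes with candidate variants.
    edges: iterable of (gene_a, gene_b) pairs from an undirected network.
    Returns a dict mapping each candidate gene to the number of patients
    carrying a candidate variant in that gene or in a direct neighbor.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    candidate_genes = set().union(*carriers.values())
    scores = {}
    for gene in candidate_genes:
        linked = neighbors.get(gene, set())
        scores[gene] = sum(
            1 for genes in carriers.values() if gene in genes or linked & genes
        )
    return scores


def rank_genes(scores):
    """Order genes by descending score, breaking ties alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))


# Hypothetical data: three patients, no gene shared by all of them.
carriers = {
    "P1": {"A", "X"},
    "P2": {"B", "Y"},
    "P3": {"C", "X"},
}
edges = [("A", "B"), ("B", "C")]

scores = neighbor_support_scores(carriers, edges)
ranked = rank_genes(scores)
```

In this example no single gene is mutated in all three patients, so a simple overlap count would favor gene X (two carriers). With the network taken into account, gene B is supported by all three patients because its neighbors A and C carry candidate variants in the other two.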

Highlights

  • Genetic heterogeneity presents a significant challenge for the identification of monogenic disease genes

  • ... (Fig. S4), leading us to reject the null hypothesis (H0) and supporting previous assertions that interacting genes are more likely to have similar phenotypic consequences [Goh et al., 2007; Feldman et al., 2008]. This makes a compelling case for using interaction networks to identify new sources of genetic heterogeneity, given that high-throughput methods are continually improving network coverage [Yu et al., 2011].

  • Genetic heterogeneity reduces the power of exome-sequencing studies to identify the molecular basis of a monogenic disease because it limits the expected overlap of genes carrying deleterious mutations in unrelated affected individuals



