Abstract

Rare genetic disorders, which can now be studied systematically with affordable genome sequencing, are often caused by high-penetrance rare variants. Such disorders are often heterogeneous and characterized by abnormalities spanning multiple organ systems ascertained with variable clinical precision. Existing methods for identifying genes with variants responsible for rare diseases summarize phenotypes with unstructured binary or quantitative variables. The Human Phenotype Ontology (HPO) allows composite phenotypes to be represented systematically but association methods accounting for the ontological relationship between HPO terms do not exist. We present a Bayesian method to model the association between an HPO-coded patient phenotype and genotype. Our method estimates the probability of an association together with an HPO-coded phenotype characteristic of the disease. We thus formalize a clinical approach to phenotyping that is lacking in standard regression techniques for rare disease research. We demonstrate the power of our method by uncovering a number of true associations in a large collection of genome-sequenced and HPO-coded cases with rare diseases.

Highlights

  • There is widespread interest in the study of rare diseases as a way of understanding the genetic architecture of biological processes

  • Each term was selected with a pre-specified probability r, termed "expressivity," and m further noise terms drawn at random from a set of approximately 1,000 Human Phenotype Ontology (HPO) terms were appended, where m ~ Poisson(λ = 5)

  • A degree of genetic heterogeneity is built into our simulation setup, because there is a non-zero probability of a template phenotype term being randomly allocated to an individual with the common genotype
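The simulation scheme described in the highlights above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the placeholder term identifiers, uniform sampling of noise terms with replacement, and the choice to give individuals with the common genotype noise terms only are all assumptions made for the example. Only the expressivity r, the vocabulary of roughly 1,000 terms, and m ~ Poisson(λ = 5) come from the text.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_phenotype(rng, template, vocabulary, has_rare_genotype, r, noise_rate=5.0):
    """Return the set of term IDs assigned to one simulated individual.

    Assumption: only individuals with the rare genotype express template terms;
    everyone receives noise terms.
    """
    terms = set()
    if has_rare_genotype:
        # Keep each template term independently with probability r ("expressivity").
        terms.update(t for t in template if rng.random() < r)
    # Append m ~ Poisson(noise_rate) noise terms drawn uniformly from the vocabulary
    # (assumption: with replacement, so duplicates collapse in the set).
    m = sample_poisson(rng, noise_rate)
    terms.update(rng.choice(vocabulary) for _ in range(m))
    return terms


# Hypothetical placeholder identifiers, not real HPO term IDs.
vocabulary = [f"TERM:{i:04d}" for i in range(1000)]
template = vocabulary[:4]  # the template is part of the vocabulary

rng = random.Random(0)
case = simulate_phenotype(rng, template, vocabulary, has_rare_genotype=True, r=0.8)
control = simulate_phenotype(rng, template, vocabulary, has_rare_genotype=False, r=0.8)
```

Because the template terms belong to the same vocabulary that the noise terms are drawn from, a simulated individual with the common genotype can receive a template term by chance, which is the source of heterogeneity noted in the last highlight.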



Introduction

There is widespread interest in the study of rare diseases as a way of understanding the genetic architecture of biological processes. To discover the cause of disease, affected subjects would ideally be grouped a priori into clusters with a shared (though unknown) genetic etiology, but this is often hindered by extensive phenotypic and genetic heterogeneity (see Web Resources and examples[1,2,3,4,5,6,7,8,9]). Even methods that account for some degree of genetic heterogeneity typically summarize the clinical manifestations of a disease with a single variable,[10] which can limit power when multiple phenotypic traits contain complementary information about the same causal genotype. Methods for modeling pleiotropy have proven successful in the context of genome-wide association studies,[11,12] but they are ill suited for rare disease studies in which the phenotype data are typically of mixed type and collected with variable detail and completeness. Methods exist that compare patient HPO data with HPO-coded profiles corresponding to known diseases for the purpose of differential diagnosis.[17,18] The HPO-coded profiles can be supplemented with functional gene-specific information to prioritize genes.[19,20] If genotype data are available, these and other methods[21,22] can be used to prioritize variants and potentially to suggest new causes of disease.[19,20,23] However, the existing approaches do not share information between individually coded patients and as such are not statistical association methods.

