Abstract
Different parts of a gene can differ in their importance to development and health. This regional heterogeneity is also apparent in the distribution of disease-associated mutations, which often cluster in particular regions of disease-associated genes. Precisely estimating the functionally important sub-regions of genes will be key to correctly deciphering relationships between genetic variation and disease. Previous methods have had some success using standing human variation to characterize this variability in importance by measuring sub-regional intolerance, i.e., the depletion of functional variation relative to expectation within a given region of a gene. However, their ability to precisely estimate local intolerance has been limited because they use only the information within a given sub-region, which makes local estimates unstable, especially for small regions. We show that borrowing information across regions using a Bayesian hierarchical model stabilizes estimates, leading to lower variability and improved predictive utility. Specifically, our approach more effectively identifies regions enriched for ClinVar pathogenic variants. We also identify significant correlations between sub-region intolerance and the distribution of pathogenic variation in disease-associated genes, with AUCs for classifying de novo missense variants in Online Mendelian Inheritance in Man (OMIM) genes of up to 0.86 using exonic sub-regions and 0.91 using sub-regions defined by protein domains. This result suggests that considering the intolerance of the regions in which variants are found may improve diagnostic interpretation. We also illustrate the utility of integrating regional intolerance into gene-level disease association tests with a study of known disease-associated genes for epileptic encephalopathy.
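The stabilizing effect of borrowing information across regions can be illustrated with a minimal sketch. This is not the hierarchical model used in this work. It assumes a simple conjugate Gamma-Poisson setup in which each sub-region's observed functional variant count is Poisson with mean equal to its expected count times a region-specific ratio, and the ratios share a Gamma prior centered on the gene-level ratio. The function name `shrink_region_ratios`, the parameter `prior_strength`, and the example counts are all hypothetical.

```python
import numpy as np


def shrink_region_ratios(observed, expected, prior_strength=5.0):
    """Partially pool per-region observed/expected ratios toward the gene-level ratio.

    Assumed toy model (not the paper's):
        observed_i ~ Poisson(ratio_i * expected_i)
        ratio_i    ~ Gamma(shape=gene_ratio * prior_strength, rate=prior_strength)
    The posterior mean of ratio_i is then
        (gene_ratio * prior_strength + observed_i) / (prior_strength + expected_i).
    """
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)  # assumed strictly positive

    # Unpooled estimate: uses only the information within each sub-region.
    raw = observed / expected

    # Gene-level ratio, used here as the prior mean shared by all sub-regions.
    gene_ratio = observed.sum() / expected.sum()

    # Posterior mean: small regions (low expected count) are pulled strongly
    # toward the gene-level ratio, large regions barely move.
    pooled = (gene_ratio * prior_strength + observed) / (prior_strength + expected)
    return raw, pooled, gene_ratio


if __name__ == "__main__":
    # Hypothetical counts for three sub-regions of one gene.
    obs = [0, 12, 30]
    exp = [2.0, 20.0, 30.0]
    raw, pooled, gene_ratio = shrink_region_ratios(obs, exp)
    for r, p, e in zip(raw, pooled, exp):
        print(f"expected={e:5.1f}  raw={r:.3f}  pooled={p:.3f}")
```

In this toy setting, the smallest region has a raw ratio of zero from only two expected variants, an estimate that a single additional observed variant would change drastically. The pooled estimate moves it most of the way toward the gene-level ratio, while the largest region stays close to its raw value. A full hierarchical model would estimate the strength of pooling from the data rather than fixing it.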