Abstract

Genome-wide association studies have discovered a large number of genetic variants in human patients with disease. Predicting the impact of these variants is therefore important for separating disease-associated variants (DVs) from neutral variants. Current methods for predicting mutational impact depend on evolutionary conservation at the mutation site, which is determined from homologous sequences, and they assume that variants at well-conserved sites have high impact. However, many DVs at less-conserved but functionally important sites cannot be predicted by the current methods. Here, we present a method to find DVs at less-conserved sites by predicting mutational impact using evolutionary coupling analysis. Functionally important and evolutionarily coupled sites often have compensatory variants on cooperative sites to avoid loss of function. We found that our method identified known intolerant variants in a diverse group of proteins. Furthermore, at less-conserved sites, we identified DVs that were not identified using conservation-based methods. These newly identified DVs were frequently found at protein interaction interfaces, where species-specific mutations often alter interaction specificity. This work presents a means to identify less-conserved DVs and provides insight into the relationship between evolutionarily coupled sites and human DVs.

Highlights

  • As sequencing technology has advanced, many nonsynonymous variants have been identified in a number of genome-wide association studies (GWASs)

  • The K329E variant of acyl-CoA dehydrogenase (ACADM) could not be identified by conservation-based methods, such as SIFT and PROVEAN [11,12], because it occurs at a site that is less conserved in multiple sequence alignments (MSAs)

  • We found that the vector of coupling number (CN) contributes less to the separation of disease-associated variants (DVs) from common polymorphic variants (CVs), which suggests that the combination of CN and cost of coupling (CC) scores into the CE score is necessary to predict variant impacts


Introduction

As sequencing technology has advanced, many nonsynonymous variants have been identified in a number of genome-wide association studies (GWASs). The K329E variant of acyl-CoA dehydrogenase (ACADM) is associated with medium-chain acyl-CoA dehydrogenase (MCAD) deficiency, which is characterized by hypoglycemia and sudden death (MIM: 201450) [9,10]. This variant could not be identified by conservation-based methods, such as SIFT and PROVEAN [11,12], because it occurs at a site that is less conserved in multiple sequence alignments (MSAs). Another evolutionary approach is needed to identify variants of residues that are not well conserved but are associated with human disease.
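
To make the notion of a "less conserved" site concrete, the following is a minimal sketch of one common way to score per-column conservation in an MSA, using normalized Shannon entropy. It is an illustration only: it is not the scoring used by SIFT or PROVEAN, nor the method presented in this work. The toy alignment and the function name `column_conservation` are hypothetical.

```python
import math
from collections import Counter

# Toy alignment (hypothetical sequences, not real ACADM homologs).
# Columns 0 and 3 are perfectly conserved; column 2 differs in every sequence.
TOY_MSA = [
    "MKAL",
    "MKEL",
    "MRDL",
    "MKSL",
]


def column_conservation(msa, col):
    """Return 1 - normalized Shannon entropy for one alignment column.

    1.0 means every sequence carries the same residue; lower values mean
    the residues at this column are more variable. Gaps ('-') are ignored.
    """
    residues = [seq[col] for seq in msa if seq[col] != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Normalize by the maximum entropy over the 20 standard amino acids.
    return 1.0 - entropy / math.log2(20)


scores = [column_conservation(TOY_MSA, c) for c in range(len(TOY_MSA[0]))]
for position, score in enumerate(scores):
    print(f"column {position}: conservation = {score:.3f}")
```

A predictor that relies only on a score of this kind would rank a variant at column 2 as low impact, however important that site is functionally. This is the gap that motivates looking at covariation between sites as well.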
