Abstract

Any two unrelated individuals differ by about 10,000 single amino acid variants (SAVs). Do these impact molecular function? Experiments cannot answer this question comprehensively, while state-of-the-art prediction methods can. We predicted the functional impact of SAVs within the human population and of variants between human and other species. Several surprising results stood out. Firstly, four methods (CADD, PolyPhen-2, SIFT, and SNAP2) agreed within 10 percentage points on the percentage of rare SAVs predicted with effect. However, they differed substantially for common SAVs: SNAP2 predicted, on average, more effect for common than for rare SAVs. Given the large ExAC data sets sampling 60,706 individuals, the differences were extremely significant (p-value < 2.2e-16). We provided evidence that SNAP2 might be closer to reality for common SAVs than the other methods, due to its different focus in development. Secondly, we predicted significantly higher fractions of SAVs with effect between healthy individuals than between species; the difference increased for more distantly related species. The same trends were maintained for subsets of only housekeeping proteins and when moving from exomes of 1,000 to 60,000 individuals. SAVs frozen at speciation might maintain protein function, while many variants within a species might bring about crucial changes, for better or worse.

Highlights

  • Single nucleotide variants (SNVs) constitute the most frequent form of human genetic variation[1]

  • We focus on non-synonymous SNVs, i.e. genomic variants that result in single amino acid variants (SAVs) in protein sequences

  • We found a similar over-representation of “secreted + cell membrane” in proteins with many effect SAVs when looking at the subset of all proteins with at least 4 common SAVs for which at least 30% were predicted at SNAP2-score >50
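
The protein subset named in the last highlight (at least 4 common SAVs, of which at least 30% score above 50 with SNAP2) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the function name, the `(protein_id, snap2_score)` input layout, and the assumption that the input already contains only common SAVs are all hypothetical.

```python
from collections import defaultdict


def proteins_with_many_effect_savs(common_savs, min_savs=4,
                                   min_fraction=0.30, score_cutoff=50):
    """Select proteins enriched in common SAVs predicted with strong effect.

    common_savs: iterable of (protein_id, snap2_score) pairs; assumed
    (hypothetically) to be pre-filtered to common SAVs only.
    Returns {protein_id: fraction of SAVs with score > score_cutoff}.
    """
    # Group the SNAP2 scores by protein.
    scores_by_protein = defaultdict(list)
    for protein_id, score in common_savs:
        scores_by_protein[protein_id].append(score)

    selected = {}
    for protein_id, scores in scores_by_protein.items():
        # Require a minimum number of common SAVs per protein.
        if len(scores) < min_savs:
            continue
        # Fraction of SAVs scoring strictly above the cutoff.
        fraction = sum(s > score_cutoff for s in scores) / len(scores)
        if fraction >= min_fraction:
            selected[protein_id] = fraction
    return selected
```

The thresholds are keyword arguments so that the same sketch can be rerun with other cutoffs; the defaults mirror the values quoted in the highlight.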

Summary

Introduction

Single nucleotide variants (SNVs) constitute the most frequent form of human genetic variation[1]. For a tiny subset of these, enough detail is available to consider all effect types (structure vs. function, molecular vs. process vs. localization). OMIM-like SAVs are assumed to largely affect the biological process through strong effects upon structure and/or molecular function (or localization). Many of these SAVs affect the organism as a whole, manifesting as disease. Computational predictions are available for all variants and inherit only some of the bias from today’s experimental techniques. Both experimental and computational assays often fail to infer the impact of variation on the organism as a whole from the molecular effects of individual SAVs.

