Clinical genomic sequencing is rapidly expanding the number of variants that need functional characterization. Interpreting genetic variants (i.e., mutations) usually begins by identifying how they affect protein-coding sequences. Still, the three-dimensional (3D) protein molecule is rarely considered in large-scale variant analysis or in analyses of how proteins interact with each other and their environment. We propose a standardized approach to scoring protein surface property changes as a new dimension for functionally and mechanistically interpreting genomic variants. The approach also directs hypothesis generation for functional genomics research into the encoded protein’s function. We developed a novel method that leverages 3D structures and time-dependent simulations to score and statistically evaluate protein surface property changes. As positive controls, we evaluated eight thermophilic versus mesophilic orthologs and variants that experimentally change the protein’s solubility, all of which showed large and statistically significant differences in charge distribution (p < 0.01). We scored static 3D structures and dynamic ensembles for 43 independent variants (23 pathogenic and 20 uninterpreted) across four proteins. For the potassium ion channel KCNK9, the average local surface potential shift was 0.41 kBT/ec with an average p-value of 1 × 10⁻². In contrast, dynamic ensemble shifts averaged 1.15 kBT/ec with an average p-value of 1 × 10⁻⁵, enabling the identification of changes far from mutated sites. This study demonstrates that an objective assessment of how mutations affect the electrostatic distributions of protein surfaces can aid in interpreting genomic variants discovered through clinical genomic sequencing.
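To illustrate what scoring and statistically evaluating a surface potential shift might look like, the following is a minimal sketch and not the method used in this study. It assumes that the potential at one surface patch has been sampled across ensemble frames for the wild type and for a variant, and it uses a two-sided permutation test, which is our own illustrative choice because the abstract does not name the statistical test. The function name `potential_shift`, its parameters, and the synthetic numbers are all hypothetical.

```python
import numpy as np


def potential_shift(wt, var, n_perm=10_000, seed=0):
    """Return (mean shift, p-value) for variant vs. wild-type potentials.

    wt, var: 1-D sequences of surface potential (kBT/ec) at one patch,
    one value per ensemble frame. Hypothetical helper for illustration.
    """
    wt = np.asarray(wt, dtype=float)
    var = np.asarray(var, dtype=float)

    # Score: difference in mean potential, variant minus wild type.
    observed = var.mean() - wt.mean()

    # Null distribution: shuffle the frame labels and recompute the difference.
    pooled = np.concatenate([wt, var])
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = perm[:var.size].mean() - perm[var.size:].mean()
        if abs(diff) >= abs(observed):
            hits += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    # Synthetic data only; these are not values from the study.
    rng = np.random.default_rng(1)
    wild_type = rng.normal(0.0, 0.5, size=200)
    variant = rng.normal(1.0, 0.5, size=200)
    shift, p = potential_shift(wild_type, variant)
    print(f"shift = {shift:.2f} kBT/ec, p = {p:.1e}")
```

With a permutation test the smallest reportable p-value is bounded by the number of permutations, so resolving values near 1 × 10⁻⁵ would require a larger `n_perm` or a parametric test.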