Pathogenicity Prediction of Single Amino Acid Variants With Machine Learning Model Based on Protein Structural Energies.

Tzu-Hsuan Wu, Meng-Ru Shen, Hsin-Hung Chou, Sun-Yuan Hsieh, Peng-Chan Lin

doi:10.1109/tcbb.2021.3139048

Open Access

https://doi.org/10.1109/tcbb.2021.3139048

Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication Date: Jan 1, 2021
Citations: 2
License type: publisher-specific, author manuscript

Abstract

Most popular tools for predicting the pathogenicity of single amino acid variants (SAVs) rely on sequence-based techniques. However, SAVs may change protein structure and function, and no existing method directly predicts the impact of mutations on the energies of the protein structure, such as those arising from van der Waals forces and disulfide bridges. Here, we combined machine learning methods with energy scores of protein structures calculated by the Rosetta Energy Function 2015 to predict SAV pathogenicity. The accuracy of our model (0.76) is higher than that of six existing prediction tools. Further analyses revealed that the differences in reference energies, attractive energies, and solvation of polar atoms between wild-type and mutant side chains played essential roles in distinguishing benign from pathogenic variants. These features reflect physicochemical properties of amino acids that are observed in 3D structures rather than in sequences. We also added 16 features to Rhapsody (the prediction tool we used for our data set), which improved its performance. The results indicate that these energy scores are a more appropriate and more detailed representation of the pathogenicity of SAVs.
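
To make the idea of "differential" energy features concrete, the sketch below computes the per-term difference between mutant and wild-type energy scores for a single variant. It is only an illustration under stated assumptions: the term names (fa_atr, fa_rep, fa_sol, dslf_fa13, ref) are score terms of the Rosetta Energy Function 2015, but the numeric values, the choice of terms, the mutant-minus-wild-type sign convention, and the helper name delta_features are hypothetical and are not taken from the paper.

```python
# Subset of REF2015 score terms used for illustration:
# attractive and repulsive van der Waals, solvation, disulfide, reference energy.
TERMS = ("fa_atr", "fa_rep", "fa_sol", "dslf_fa13", "ref")


def delta_features(wild, mutant, terms=TERMS):
    """Return mutant-minus-wild-type energy differences, one per term.

    The sign convention (mutant - wild type) is an assumption of this sketch.
    """
    return [mutant[t] - wild[t] for t in terms]


# Hypothetical per-term energies (Rosetta energy units) for one variant site.
wild = {"fa_atr": -5.0, "fa_rep": 1.0, "fa_sol": 3.0, "dslf_fa13": 0.0, "ref": 0.5}
mutant = {"fa_atr": -3.5, "fa_rep": 2.0, "fa_sol": 3.5, "dslf_fa13": 0.0, "ref": 1.5}

features = delta_features(wild, mutant)
print(dict(zip(TERMS, features)))
```

A vector like this, computed for each variant, could then serve as the input row for a supervised classifier that separates benign from pathogenic variants; the abstract does not specify which classifier was used.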
