Estimation of model accuracy by a unique set of features and tree-based regressor

Mor Bitton, Chen Keasar

doi:10.1038/s41598-022-17097-z

Open Access

Journal: Scientific Reports
Publication Date: Aug 18, 2022
Citations: 2
License type: open-access

Affiliation: Ben-Gurion University of the Negev

Abstract

Computationally generated models of protein structures bridge the gap between the practically negligible price tag of sequencing and the high cost of experimental structure determination. By providing a low-cost (and often free) partial alternative to experimentally determined structures, these models help biologists design and interpret their experiments. The more accurate the models, the more useful they are. However, methods for protein structure prediction generate many structural models of varying quality, necessitating methods for estimating their accuracy. In this work we present MESHI_consensus, a new method for the estimation of model accuracy. The method uses a tree-based regressor and a set of structural, target-based, and consensus-based features. The new method achieved high performance in the EMA (Estimation of Model Accuracy) track of the recent CASP14 community-wide experiment (https://predictioncenter.org/casp14/index.cgi). The tertiary structure prediction track of that experiment revealed an unprecedented leap in prediction performance by a single prediction group/method, namely AlphaFold2. This achievement will inevitably have a profound impact on the field of protein structure prediction, including the accuracy estimation sub-task. We conclude this manuscript with some speculations regarding the future role of accuracy estimation in a new era of accurate protein structure prediction.

Full Text