Abstract
Much progress has been made in protein structure prediction during the last few decades. Because predicted models span a broad range of accuracy, reliable quality estimation has become one of the key elements of successful protein structure prediction. Over the past years, a number of methods have been developed to address this issue. These methods can be roughly divided into three categories: single-model methods, clustering-based methods, and quasi single-model methods. In this study, we first develop a single-model method, MQAPRank, based on the learning-to-rank algorithm, and then implement a quasi single-model method, Quasi-MQAPRank. The proposed methods are benchmarked on the 3DRobot and CASP11 datasets. Five-fold cross-validation on the 3DRobot dataset shows that the proposed single-model method outperforms the other methods whose outputs are used as its features, and that the quasi single-model method further improves performance. On the CASP11 dataset, the proposed methods also perform well compared with other leading methods in the corresponding categories. In particular, the Quasi-MQAPRank method achieves strong performance on the CASP11 Best150 dataset.
Highlights
Much progress has been made in protein structure prediction during the last few decades
Because predicted models span a broad range of accuracy, reliable quality estimation has become one of the key elements of successful protein structure prediction
The Quasi-MQAPRank method achieves strong performance on the CASP11 Best150 dataset
Summary
Much progress has been made in protein structure prediction during the last few decades. Previous studies have found that clustering-based methods generally outperform single-model methods when numerous models are available from several different structure prediction methods[15,16], a finding confirmed by the CASP (Critical Assessment of protein Structure Prediction) experiments[4,5,17]. The problem of decoy model quality assessment can be framed as ranking the decoy models by their similarity to the corresponding native structure. This similarity can be measured by various structural alignment scores, such as the GDT_TS score (global distance test total score)[1], TM-score[25], MaxSub score[26], and LGA score[27]. Owing to their good performance, learning-to-rank methods have been applied to many bioinformatics tasks, including disease name normalization[33], biomedical document retrieval[34], gene summary extraction[35], and protein folding energy design[36].
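To make the ranking view of quality assessment concrete, the following is a minimal sketch, not the authors' implementation. It computes a GDT_TS-style score as the mean fraction of residues within 1, 2, 4 and 8 Å of their native positions, and then orders decoys by that score. It assumes the per-residue Cα distances have already been obtained from a single fixed superposition; the full GDT_TS procedure searches over many superpositions, so this value is only an approximation (a lower bound). The function names and the input layout are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Sequence

# Distance cutoffs (in angstroms) used by the GDT_TS score.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts_fixed_superposition(distances: Sequence[float], target_length: int) -> float:
    """Approximate GDT_TS (0-100) from per-residue C-alpha distances.

    Assumption: `distances` come from one fixed superposition of the decoy
    onto the native structure. Residues missing from the decoy are simply
    absent from `distances` and therefore count as not matched.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    fractions = [
        sum(1 for d in distances if d <= cutoff) / target_length
        for cutoff in GDT_TS_CUTOFFS
    ]
    return 100.0 * sum(fractions) / len(fractions)


def rank_decoys(decoy_distances: Dict[str, Sequence[float]], target_length: int) -> List[str]:
    """Return decoy names ordered from most to least native-like."""
    scores = {
        name: gdt_ts_fixed_superposition(dists, target_length)
        for name, dists in decoy_distances.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy distances for three hypothetical decoys of a 5-residue target.
    decoys = {
        "decoy_a": [0.5, 0.5, 0.5, 0.5, 0.5],
        "decoy_b": [10.0, 10.0, 10.0, 10.0, 10.0],
        "decoy_c": [3.0, 3.0, 3.0, 3.0, 3.0],
    }
    for name in rank_decoys(decoys, target_length=5):
        print(name, gdt_ts_fixed_superposition(decoys[name], 5))
```

In this framing, the true ordering produced by such a score serves as the ranking target, and a quality assessment method is judged by how closely its predicted ordering matches it without access to the native structure.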