Abstract
Background: Protein structure prediction has made substantial progress over the last few decades, and a large number of models can now be predicted for a given sequence. Consequently, assessing the quality of predicted protein models is one of the key components of successful protein structure prediction. Over the past years, a number of methods have been developed to address this issue; they can be roughly divided into three categories: single methods, quasi-single methods, and clustering (or consensus) methods. Although these methods have achieved success at different levels, accurate protein model quality assessment is still an open problem.

Results: Here, we present MQAPRank, a global protein model quality assessment program based on learning-to-rank. MQAPRank first sorts the decoy models with a single method based on a learning-to-rank algorithm to indicate their relative qualities for the target protein. It then takes the top five models as references and predicts the quality of the other models from their average GDT_TS scores against the reference models. Benchmarked on the CASP11 and 3DRobot datasets, MQAPRank performed better than other leading protein model quality assessment methods. MQAPRank also participated in CASP12 under the group name FDUBio and achieved state-of-the-art performance.

Conclusions: MQAPRank provides a convenient and powerful tool for protein model quality assessment with state-of-the-art performance; it is useful for both protein structure prediction and model quality assessment.
Highlights
We present MQAPRank, a global protein model quality assessment program based on learning-to-rank.
Protein structure prediction has made substantial progress over the last few decades, and a large number of models can now be predicted for a given sequence.
Table notes (the only surviving table row is ModFOLD6_cor, a quasi-single method):
a) Best 150: the dataset comprising the best 150 models submitted for a target according to the benchmark consensus method.
b) Select 20: the dataset comprising 20 models spanning the whole range of server model difficulty for each target.
c) Diff: the average difference between the predicted and GDT_TS scores.
d) MCC: Matthews correlation coefficient.
e) AUC: the area under the ROC curve.
f) Loss: the loss in quality between the best available model and the predicted best model.
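To make the Diff and Loss definitions above concrete, here is a minimal Python sketch for a single target. It is an illustration, not the evaluation code used in the benchmark. The function names `average_diff` and `selection_loss` are hypothetical, and the sketch assumes that Diff is a mean absolute difference and that both score lists are on the same scale.

```python
from typing import Sequence


def average_diff(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute difference between predicted and observed GDT_TS scores.

    Assumption: "average difference" is taken as an absolute difference.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def selection_loss(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Observed quality of the best available model minus the observed
    quality of the model that the predictor ranks first."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Index of the model with the highest predicted score.
    picked = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(observed) - observed[picked]


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not taken from the paper.
    predicted_scores = [0.5, 0.7, 0.6]
    observed_gdt_ts = [0.6, 0.5, 0.8]
    print(average_diff(predicted_scores, observed_gdt_ts))
    print(selection_loss(predicted_scores, observed_gdt_ts))
```

A Loss of zero means the predictor picked the best available model for the target; larger values mean a worse pick.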
Summary
We present MQAPRank, a global protein model quality assessment program based on learning-to-rank. MQAPRank first sorts the decoy models with a single method based on a learning-to-rank algorithm to indicate their relative qualities for the target protein. It then takes the top five models as references and predicts the quality of the other models from their average GDT_TS scores against the reference models. Benchmarked on the CASP11 and 3DRobot datasets, MQAPRank performed better than other leading protein model quality assessment methods. MQAPRank participated in CASP12 under the group name FDUBio and achieved state-of-the-art performance.
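The two-stage procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the MQAPRank implementation. It assumes the learning-to-rank scores (`rank_scores`, higher is better) and a symmetric matrix of pairwise GDT_TS values between decoys (`pairwise_gdt`) have already been computed elsewhere. The function name `reference_based_scores` is hypothetical. The source does not say how the reference models themselves are scored, so this sketch scores each one against the remaining references.

```python
from typing import List, Sequence


def reference_based_scores(
    rank_scores: Sequence[float],
    pairwise_gdt: Sequence[Sequence[float]],
    n_refs: int = 5,
) -> List[float]:
    """Predict a quality score per decoy as its mean GDT_TS to the
    top-ranked reference decoys."""
    n = len(rank_scores)
    if n < 2 or n_refs < 2:
        raise ValueError("need at least two decoys and two references")
    if len(pairwise_gdt) != n or any(len(row) != n for row in pairwise_gdt):
        raise ValueError("pairwise_gdt must be an n x n matrix")

    # Stage 1: sort decoys by ranking score and keep the top n_refs as references.
    order = sorted(range(n), key=lambda i: rank_scores[i], reverse=True)
    refs = order[: min(n_refs, n)]

    # Stage 2: average GDT_TS between each decoy and the references.
    predicted = []
    for i in range(n):
        # Assumption: a reference is not compared with itself.
        others = [r for r in refs if r != i]
        predicted.append(sum(pairwise_gdt[i][r] for r in others) / len(others))
    return predicted


if __name__ == "__main__":
    # Toy inputs for illustration only; they are not taken from the paper.
    scores = [0.9, 0.1, 0.8, 0.3]
    gdt = [
        [1.0, 0.2, 0.8, 0.4],
        [0.2, 1.0, 0.3, 0.5],
        [0.8, 0.3, 1.0, 0.6],
        [0.4, 0.5, 0.6, 1.0],
    ]
    print(reference_based_scores(scores, gdt, n_refs=2))
```

Because the second stage compares each decoy against a small reference set, it behaves like a quasi-single method: its output depends on how well the first-stage ranking selects good references.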