Abstract

Background: In structural biology, protein residue-residue contacts play a crucial role in protein structure prediction. Some researchers have found that predicted residue-residue contacts can effectively constrain the conformational search space, which is significant for de novo protein structure prediction. Over the last few decades, researchers have developed various methods to predict residue-residue contacts, and fusion methods in particular have achieved significant performance in recent years. In this work, a novel fusion method based on a ranking strategy is proposed to predict contacts. Unlike traditional regression or classification strategies, the contact prediction task is treated as a ranking task. First, two kinds of features are extracted from correlated mutations methods and ensemble machine-learning classifiers; the proposed method then uses a learning-to-rank algorithm to predict the contact probability of each residue pair.

Results: First, we perform two benchmark tests of the proposed fusion method (RRCRank), on the CASP11 and CASP12 datasets respectively. The test results show that RRCRank outperforms other well-developed methods, especially for medium and short range contacts. Second, to verify the superiority of the ranking strategy, we predict contacts using the traditional regression and classification strategies with the same features as the ranking strategy. Compared with these two traditional strategies, the proposed ranking strategy shows better performance for all three contact types, in particular for long range contacts. Third, RRCRank is compared with several state-of-the-art methods in CASP11 and CASP12. The results show that RRCRank achieves comparable prediction precision and is better than three methods in most assessment metrics.

Conclusions: The learning-to-rank algorithm is introduced to develop a novel rank-based method for protein residue-residue contact prediction, which achieves state-of-the-art performance in an extensive assessment.

Highlights

  • In structural biology, protein residue-residue contacts play a crucial role in protein structure prediction

  • The proposed method, Residue-Residue Contact prediction by learning-to-Rank (RRCRank), formulates the contact prediction task as a ranking task

  • RRCRank uses a learning-to-rank method to sort residue pairs according to their contact probability
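
To make the idea of sorting residue pairs concrete, the following is a minimal sketch of how a ranked list of pairs might be evaluated per contact range. It is not the RRCRank implementation. The function names, the top-L/5 cutoff, and the sequence-separation bins (short 6-11, medium 12-23, long 24 or more) are assumptions based on common CASP-style evaluation practice and are not stated in the text above.

```python
from typing import Dict, Optional, Set, Tuple

Pair = Tuple[int, int]

# Assumed sequence-separation bins (a common CASP-style convention,
# not taken from the source text).
RANGES: Dict[str, Tuple[int, Optional[int]]] = {
    "short": (6, 11),
    "medium": (12, 23),
    "long": (24, None),
}


def in_range(i: int, j: int, kind: str) -> bool:
    """Return True if residues i and j fall in the given separation bin."""
    lo, hi = RANGES[kind]
    sep = abs(i - j)
    return sep >= lo and (hi is None or sep <= hi)


def top_k_precision(
    scores: Dict[Pair, float],
    contacts: Set[Pair],
    seq_len: int,
    kind: str,
    divisor: int = 5,
) -> float:
    """Precision of the top L/divisor ranked pairs within one range."""
    k = max(1, seq_len // divisor)
    candidates = [(p, s) for p, s in scores.items() if in_range(p[0], p[1], kind)]
    # Rank pairs by predicted score, highest first, and keep the top k.
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair, _ in ranked if pair in contacts)
    return hits / len(ranked)


# Toy example with made-up scores for a hypothetical 10-residue window.
toy_scores = {(0, 7): 0.9, (1, 9): 0.8, (2, 8): 0.1, (0, 20): 0.95}
toy_contacts = {(0, 7), (2, 8)}
short_precision = top_k_precision(toy_scores, toy_contacts, seq_len=10, kind="short")
```

Because only the order of the pairs matters for this kind of metric, a model that ranks true contacts above non-contacts can score well even if its raw outputs are not calibrated probabilities.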


Summary

Introduction

Protein residue-residue contacts play a crucial role in protein structure prediction. Two kinds of features are extracted from correlated mutations methods and ensemble machine-learning classifiers, and the proposed method uses a learning-to-rank algorithm to predict the contact probability of each residue pair. Researchers have developed various methods (such as fragment-based assembly methods and molecular dynamics simulation methods) to model the lowest-free-energy structures for given protein sequences. To predict protein contacts accurately, researchers have developed many methods since the 1990s. These methods can be classified into five kinds: correlated mutations methods, machine-learning methods, fusion methods, template-based methods and 3D model-based methods. Considering that protein contacts are mainly used for protein structure prediction, 3D model-based methods have limited use in most cases.
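
As an illustration of treating contact prediction as a ranking task over the two feature kinds, here is a minimal pairwise learning-to-rank sketch. The source does not say which learning-to-rank algorithm RRCRank uses, so this linear scorer trained with a pairwise logistic loss is a stand-in chosen for brevity. The feature layout (one correlated-mutation score and one classifier probability per residue pair), the function names, and the toy values are all hypothetical.

```python
import math
import random
from typing import List, Sequence


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear ranking score for one residue pair's feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise_ranker(
    features: List[List[float]],
    labels: List[int],
    epochs: int = 50,
    lr: float = 0.1,
    seed: int = 0,
) -> List[float]:
    """Fit weights so that contacts (label 1) score above non-contacts (label 0)."""
    rng = random.Random(seed)
    pos = [x for x, y in zip(features, labels) if y == 1]
    neg = [x for x, y in zip(features, labels) if y == 0]
    pairs = [(p, n) for p in pos for n in neg]
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        rng.shuffle(pairs)
        for p, n in pairs:
            margin = score(w, p) - score(w, n)
            # Pairwise logistic loss log(1 + exp(-margin)); its gradient
            # with respect to w is -g * (p - n), with g as below.
            g = 1.0 / (1.0 + math.exp(min(margin, 50.0)))
            w = [wi + lr * g * (pi - ni) for wi, pi, ni in zip(w, p, n)]
    return w


# Hypothetical features: [correlated-mutation score, classifier probability].
toy_features = [[0.9, 0.8], [0.7, 0.9], [0.2, 0.1], [0.3, 0.4], [0.1, 0.3]]
toy_labels = [1, 1, 0, 0, 0]
weights = train_pairwise_ranker(toy_features, toy_labels)
ranked = sorted(range(len(toy_features)),
                key=lambda i: score(weights, toy_features[i]), reverse=True)
```

The loss is defined on pairs of examples rather than on single examples, which is what distinguishes this setup from the regression and classification strategies the abstract compares against.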

