Abstract

Background

Protein inter-residue contact maps provide a translation- and rotation-invariant topological representation of a protein. They can be used as an intermediary step in protein structure prediction. However, the prediction of contact maps is an unbalanced problem, as far fewer examples of contacts than non-contacts exist in a protein structure.

In this study we explore the possibility of completely eliminating the unbalanced nature of the contact map prediction problem by predicting real-valued distances between residues. Predicting full inter-residue distance maps and applying them in protein structure prediction has been relatively unexplored in the past.

Results

We initially demonstrate that native-like distance maps are able to reproduce 3D structures almost identical to the targets, giving an average RMSD of 0.5 Å. In addition, physical maps corrupted with a random error of ±6 Å are able to reconstruct the targets within an average RMSD of 2 Å.

After demonstrating the reconstruction potential of distance maps, we develop two classes of predictors using two-dimensional recursive neural networks: an ab initio predictor that relies only on the protein sequence and evolutionary information, and a template-based predictor in which additional structural homology information is provided. We find that the ab initio predictor is able to reproduce distances with an RMSD of 6 Å, regardless of the evolutionary content provided. Furthermore, we show that the template-based predictor exploits both sequence and structure information even in cases of dubious homology and outperforms the best template hit by a clear margin of up to 3.7 Å.

Lastly, we demonstrate the ability of the two predictors to reconstruct the CASP9 targets shorter than 200 residues, producing results similar to those of the state-of-the-art machine learning approach implemented in the Distill server.

Conclusions

The methodology presented here, if complemented by more complex reconstruction protocols, can represent a possible path to improving machine learning algorithms for 3D protein structure prediction. Moreover, it can be used as an intermediary step in protein structure prediction, either on its own or complemented by NMR restraints.
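To make the relationship between distance maps and contact maps concrete, the following is a minimal sketch in plain Python, not the authors' pipeline. It builds a distance map from 3D coordinates, thresholds it into a contact map, and checks that the distance map is unchanged by a rotation and translation. The 8 Å cutoff, the random-walk toy chain with 3.8 Å steps, and all function names are assumptions made for illustration only.

```python
import math
import random


def distance_map(coords):
    """Return the full matrix of Euclidean distances between all residue pairs."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_map(dmap, threshold=8.0):
    """Binarise a distance map: 1 where the distance is below the cutoff (in Å).

    The 8 Å cutoff is an assumed, commonly used value, not taken from the text.
    """
    return [[1 if d < threshold else 0 for d in row] for row in dmap]


def contact_fraction(cmap):
    """Fraction of distinct residue pairs (i < j) that are in contact."""
    n = len(cmap)
    contacts = sum(cmap[i][j] for i in range(n) for j in range(i + 1, n))
    return contacts / (n * (n - 1) / 2)


def rotate_z_and_shift(coords, angle, shift):
    """Rotate coordinates about the z axis by `angle` radians, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


def toy_chain(n, step=3.8, seed=0):
    """Hypothetical stand-in for a protein: a random walk with fixed step length."""
    rng = random.Random(seed)
    coords = [(0.0, 0.0, 0.0)]
    for _ in range(n - 1):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(a * a for a in v))
        x, y, z = coords[-1]
        coords.append(
            (x + step * v[0] / norm, y + step * v[1] / norm, z + step * v[2] / norm)
        )
    return coords


chain = toy_chain(100)
dmap = distance_map(chain)
cmap = contact_map(dmap)

# The same chain after an arbitrary rigid-body motion has the same distance map.
moved = rotate_z_and_shift(chain, 1.0, (10.0, -5.0, 3.0))
max_change = max(
    abs(a - b) for row_a, row_b in zip(dmap, distance_map(moved))
    for a, b in zip(row_a, row_b)
)

print("fraction of pairs in contact:", contact_fraction(cmap))
print("largest change in any distance after moving the chain:", max_change)
```

On a chain like this, contacts are typically a minority of all residue pairs, which is the class imbalance described above. The real-valued distance map carries a target for every pair, so no thresholding into two unequal classes is needed.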

Highlights

  • Protein inter-residue contact maps provide a translation and rotation invariant topological representation of a protein

  • We report on the average root mean square deviation (RMSD) between the native and predicted distance maps obtained as outputs of the ab initio and template-based predictors

  • We use native maps extracted from 93 solved 3D structures of the CASP7 targets
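The RMSD between a native and a predicted distance map can be sketched as below. This is an illustrative definition under stated assumptions, not necessarily the exact measure used by the authors: the average is taken over the upper triangle of the map, and the `min_separation` parameter (a hypothetical name) lets near-diagonal pairs be skipped.

```python
import math


def distance_map_rmsd(native, predicted, min_separation=1):
    """Root mean square deviation between two distance maps, in the maps' units.

    Only pairs (i, j) with j - i >= min_separation are compared, so each
    distinct residue pair is counted once and the zero diagonal is ignored.
    """
    n = len(native)
    if len(predicted) != n:
        raise ValueError("maps must have the same size")
    squared_error = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            diff = native[i][j] - predicted[i][j]
            squared_error += diff * diff
            pairs += 1
    if pairs == 0:
        raise ValueError("no residue pairs to compare")
    return math.sqrt(squared_error / pairs)


# Tiny hand-made example (distances in Å); the values are made up.
native_map = [
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
]
predicted_map = [
    [0.0, 4.0, 4.0],
    [4.0, 0.0, 3.0],
    [4.0, 3.0, 0.0],
]

# Pair errors are -1, 0 and 2 Å, so the RMSD is sqrt((1 + 0 + 4) / 3).
rmsd = distance_map_rmsd(native_map, predicted_map)
print("distance-map RMSD:", rmsd)
```

Note that this compares two maps entry by entry. It is a different quantity from the RMSD between a reconstructed 3D structure and its target, which requires superimposing coordinates first.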


Summary

Introduction

Protein inter-residue contact maps provide a translation- and rotation-invariant topological representation of a protein. They can be used as an intermediary step in protein structure prediction. Template-based models utilize sequence and structure similarity between an unknown protein, the so-called ‘target’, and known structures, termed ‘templates’, presumed to be homologous to the target. This category of models has become increasingly accurate in predicting the structures of globular proteins in recent years [4,5,6]. Models that use only evolutionary constraints have emerged [10,11].
