Abstract
Background: In recent years, successful contact prediction methods and contact-guided ab initio protein structure prediction methods have highlighted the importance of incorporating contact information into protein structure prediction methods. It is also observed that for almost all globular proteins, the quality of contact prediction dictates the accuracy of structure prediction. Hence, like many existing evaluation measures for evaluating 3D protein models, various measures are currently used to evaluate predicted contacts, with the most popular ones being precision, coverage and distance distribution score (Xd).
Results: We have built a web application and a downloadable tool, ConEVA, for comprehensive assessment and detailed comparison of predicted contacts. Besides implementing existing measures for contact evaluation we have implemented new and useful methods of contact visualization using chord diagrams and comparison using Jaccard similarity computations. For a set (or sets) of predicted contacts, the web application runs even when a native structure is not available, visualizing the contact coverage and similarity between predicted contacts. We applied the tool on various contact prediction data sets and present our findings and insights we obtained from the evaluation of effective contact assessments. ConEVA is publicly available at http://cactus.rnet.missouri.edu/coneva/.
Conclusion: ConEVA is useful for a range of contact related analysis and evaluations including predicted contact comparison, investigation of individual protein folding using predicted contacts, and analysis of contacts in a structure of interest.
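To make two of the measures named above concrete, here is a minimal Python sketch of precision and Jaccard similarity over sets of residue pairs. It is not ConEVA's code: the function names, the representation of a contact as an unordered pair of residue indices, and all example values are assumptions made for illustration.

```python
def normalize(contacts):
    """Store each contact as an (i, j) tuple with i < j so that (5, 2) == (2, 5)."""
    return {tuple(sorted(pair)) for pair in contacts}


def precision(predicted, native):
    """Fraction of predicted contacts that are also present in the native set."""
    predicted, native = normalize(predicted), normalize(native)
    if not predicted:
        return 0.0
    return len(predicted & native) / len(predicted)


def jaccard(set_a, set_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two contact sets."""
    set_a, set_b = normalize(set_a), normalize(set_b)
    union = set_a | set_b
    if not union:
        return 1.0  # assumption: two empty sets are treated as identical
    return len(set_a & set_b) / len(union)


# Hypothetical contact lists (residue index pairs), not taken from any dataset.
method_a = [(1, 10), (2, 20), (3, 30), (4, 40)]
method_b = [(20, 2), (3, 30), (5, 50)]
native = [(1, 10), (2, 20), (6, 60)]

precision_a = precision(method_a, native)
similarity_ab = jaccard(method_a, method_b)
print(precision_a, similarity_ab)
```

Note that the Jaccard similarity needs only the two predicted sets, which is consistent with the comparison being possible when no native structure is available; precision, by contrast, requires the native contacts.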
Highlights
In recent years, successful contact prediction methods and contact-guided ab initio protein structure prediction methods have highlighted the importance of incorporating contact information into protein structure prediction methods.
To study the relationship between the length of the protein (L) and the quality of contacts suggested by the various contact evaluation measures, we computed Spearman’s rank correlation coefficient between the length of the protein and each evaluation measure (precision, coverage, distance distribution score (Xd), mean false positive error, and spread) for the long-range contacts predicted in the PSICOV dataset.
Spread and coverage are more correlated with L when fewer contacts are selected, whereas Xd is more correlated with L when more contacts are selected for evaluation.
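The correlation analysis described above can be sketched as follows. This is a minimal pure-Python illustration of Spearman's rank correlation (the Pearson correlation of the ranks, with tied values given their average rank), not the authors' analysis script; the function names and the example lengths and precision values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: one protein per position, not taken from the PSICOV dataset.
lengths = [75, 110, 150, 190, 240]
precision_values = [0.90, 0.70, 0.80, 0.50, 0.40]

rho = spearman(lengths, precision_values)
print(round(rho, 3))
```

Repeating the calculation once per evaluation measure and per contact selection would give one coefficient for each combination, which is the kind of comparison the highlights summarize.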
Summary
Successful contact prediction methods and contact-guided ab initio protein structure prediction methods have highlighted the importance of incorporating contact information into protein structure prediction methods. It is observed that for almost all globular proteins, the quality of contact prediction dictates the accuracy of structure prediction. The recent success of many protein residue contact prediction methods has kindled new hope of solving the long-standing problem of ab initio protein structure prediction [1,2,3,4,5,6]. When accurately predicted contacts are supplied as input to structure prediction or reconstruction methods, accurate folds can be predicted consistently [1,7,8,9].