Abstract

Motivation: Recent developments in statistical techniques for inferring direct evolutionary couplings between residue pairs have made covariation-based contact prediction a viable means of accurate 3D modelling of proteins, with no information other than the sequence required. To extend the usefulness of contact prediction, we have designed a new meta-predictor (MetaPSICOV) which combines three distinct approaches for inferring covariation signals from multiple sequence alignments, and which considers a broad range of other sequence-derived features and, uniquely, a range of metrics describing both the local and global quality of the input multiple sequence alignment. Finally, we use a two-stage predictor, where the second stage filters the output of the first stage. This two-stage predictor is additionally evaluated on its ability to accurately predict the long-range network of hydrogen bonds, including correctly assigning the donor and acceptor residues.

Results: Using the original PSICOV benchmark set of 150 protein families, MetaPSICOV achieves a mean precision of 0.54 for top-L predicted long-range contacts, around 60% higher than PSICOV and around 40% better than CCMpred. In de novo protein structure prediction using FRAGFOLD, MetaPSICOV is able to improve the TM-scores of models by a median of 0.05 compared with PSICOV. Lastly, for predicting long-range hydrogen bonding, MetaPSICOV-HB achieves a precision of 0.69 for the top-L/10 hydrogen bonds compared with just 0.26 for the baseline MetaPSICOV.

Availability and implementation: MetaPSICOV is freely available as a web server at http://bioinf.cs.ucl.ac.uk/MetaPSICOV. Raw data (predicted contact lists and 3D models) and source code can be downloaded from http://bioinf.cs.ucl.ac.uk/downloads/MetaPSICOV.

Contact: d.t.jones@ucl.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • Mean precision values for the top-L, top-L/2, top-L/5 and top-L/10 contacts (L = length of target protein) at sequence separation ranges i-j ≥ 5 and i-j ≥ 23, where the Cβ-Cβ distance < 8 Å

  • Comparison of model TM-scores for FRAGFOLD-generated models using contacts from (a) MetaPSICOV stage 1 and PSICOV, (b) MetaPSICOV stage 2 and PSICOV, and (c) MetaPSICOV stage 1 and MetaPSICOV stage 2

  • TM-scores for representative models extracted from FRAGFOLD ensembles for each of the 150 benchmark protein chains
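
The precision metric named in the first highlight can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `top_k_precision`, its parameters, and the input layout (a list of scored residue pairs plus a mapping from residue index to Cβ coordinates) are assumptions made for the example. Only the separation thresholds, the top-L/k cut and the 8 Å cutoff come from the text above.

```python
import math


def top_k_precision(predictions, coords, length, divisor=1, min_sep=23, cutoff=8.0):
    """Fraction of the top L/divisor predicted pairs that are true contacts.

    predictions: list of (i, j, score) tuples (hypothetical input layout)
    coords:      dict mapping residue index -> (x, y, z) of its C-beta atom
    length:      L, the length of the target protein
    """
    # Keep only pairs at or beyond the minimum sequence separation.
    eligible = [p for p in predictions if abs(p[0] - p[1]) >= min_sep]
    # Rank by descending score and keep the top L/divisor pairs.
    eligible.sort(key=lambda p: p[2], reverse=True)
    top = eligible[: max(1, length // divisor)]
    if not top:
        return 0.0
    # A predicted pair counts as correct if the C-beta atoms lie within the cutoff.
    hits = sum(1 for i, j, _ in top if math.dist(coords[i], coords[j]) < cutoff)
    return hits / len(top)


# Toy data, invented purely for illustration.
toy_coords = {0: (0.0, 0.0, 0.0), 1: (5.0, 0.0, 0.0), 2: (20.0, 0.0, 0.0), 3: (3.0, 4.0, 0.0)}
toy_predictions = [(0, 1, 0.9), (0, 2, 0.8), (0, 3, 0.7), (1, 2, 0.2)]
```

Calling `top_k_precision(toy_predictions, toy_coords, length=4, divisor=2, min_sep=1)` scores the two highest-ranked pairs; a real evaluation would use `min_sep=5` or `min_sep=23` and `divisor` values of 1, 2, 5 and 10 to match the ranges listed above.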


