Abstract
Background: Despite the rapid progress of protein residue contact prediction, predicted residue contact maps frequently contain many errors. However, information on residue pairing in β strands can be extracted from a noisy contact map, because β-β interactions produce characteristic contact patterns. This information may benefit the tertiary structure prediction of mainly β proteins. In this work, we propose a novel ridge-detection-based β-β contact predictor to identify residue pairing in β strands from any predicted residue contact map.

Results: Our algorithm RDb2C adopts ridge detection, a well-developed technique in computer image processing, to capture consecutive residue contacts, and then uses a novel multi-stage random forest framework to integrate the ridge information with additional features for prediction. Starting from the contact map predicted by CCMpred, RDb2C remarkably outperforms all state-of-the-art methods on two conventional test sets of β proteins (BetaSheet916 and BetaSheet1452), achieving F1-scores of ~62% and ~76% at the residue level and strand level, respectively. Taking the prediction of the more advanced RaptorX-Contact as input, RDb2C performs considerably better, with F1-scores reaching ~76% and ~86% at the residue level and strand level, respectively. In a structural modeling test on 61 mainly β proteins, using the top 1L predicted contacts (L being the sequence length) as constraints, the average TM-score is 0.442 with the raw RaptorX-Contact prediction and increases to 0.506 with the prediction improved by RDb2C.

Conclusion: Our method can significantly improve the prediction of β-β contacts from any predicted residue contact map. The prediction results of our algorithm can be applied directly to facilitate the practical structure prediction of mainly β proteins.

Availability: All source data and codes are available at http://166.111.152.91/Downloads.html or on GitHub at https://github.com/wzmao/RDb2C.
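The abstract reports performance as F1-scores and uses the top-ranked contacts as modeling constraints. As a rough illustration of those two notions only, the sketch below ranks residue pairs by predicted score, keeps the top n, and computes the F1-score against a reference set. The function names and toy data are hypothetical and are not taken from the RDb2C code base; the exact evaluation protocol of the paper may differ.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # (i, j) residue indices of a contact


def top_contacts(scores: Dict[Pair, float], n: int) -> Set[Pair]:
    """Return the n highest-scoring residue pairs."""
    ranked = sorted(scores, key=lambda pair: scores[pair], reverse=True)
    return set(ranked[:n])


def f1_score(predicted: Set[Pair], native: Set[Pair]) -> float:
    """Harmonic mean of precision and recall over residue pairs."""
    true_pos = len(predicted & native)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(native)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): predicted scores for five residue pairs
# and the three pairs that are in contact in the reference structure.
scores = {(1, 10): 0.9, (2, 9): 0.8, (3, 8): 0.3, (4, 20): 0.7, (5, 6): 0.1}
native = {(1, 10), (2, 9), (3, 8)}

selected = top_contacts(scores, 3)
f1 = f1_score(selected, native)
print(sorted(selected), round(f1, 3))
```

In this toy case two of the three selected pairs are correct, so precision and recall are both 2/3 and so is the F1-score.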
Highlights
Despite the rapid progress of protein residue contact prediction, predicted residue contact maps frequently contain many errors
Given the original predicted contact map and extracted ridge information, we developed a novel multi-stage random forest framework to further refine the prediction of β–β contacts
Ridge-Detection-based β-β Contact predictor (RDb2C) starts from a residue contact map predicted from the amino acid sequence of the target protein, e.g. by CCMpred or by RaptorX-Contact
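The intuition behind looking for ridges can be shown with a toy example. Paired β strands appear in a contact map as short streaks of consecutive contacts: along an anti-diagonal for antiparallel pairing and along a diagonal for parallel pairing. The sketch below is a deliberately simplified stand-in, a directional averaging filter, and not the ridge detector used in RDb2C; the function names, window size and toy map are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def streak_score(cmap: Matrix, i: int, j: int, step: int,
                 half_width: int = 2) -> float:
    """Mean contact score in a window centred on (i, j), taken along
    the diagonal (step=+1) or the anti-diagonal (step=-1)."""
    n = len(cmap)
    total, count = 0.0, 0
    for k in range(-half_width, half_width + 1):
        a, b = i + k, j + step * k
        if 0 <= a < n and 0 <= b < n:
            total += cmap[a][b]
            count += 1
    return total / count  # count >= 1 because (i, j) itself is in range


def ridge_map(cmap: Matrix, half_width: int = 2) -> Matrix:
    """Keep, for every cell, the stronger of the two directional means."""
    n = len(cmap)
    return [[max(streak_score(cmap, i, j, +1, half_width),
                 streak_score(cmap, i, j, -1, half_width))
             for j in range(n)] for i in range(n)]


# Toy 12 x 12 map (hypothetical): an antiparallel streak with one weak
# cell at (3, 8), plus one isolated high-scoring false positive at (8, 2).
size = 12
cmap = [[0.0] * size for _ in range(size)]
for i, j in [(1, 10), (2, 9), (4, 7)]:
    cmap[i][j] = 0.6
cmap[3][8] = 0.1   # true contact that the raw predictor scored low
cmap[8][2] = 0.6   # isolated noise

ridges = ridge_map(cmap)
print(ridges[3][8], ridges[8][2])
```

On the raw map the isolated cell outranks the weak cell inside the streak; after filtering the order is reversed, because the weak cell is supported by its neighbours along the anti-diagonal while the isolated cell is not.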
Summary
Despite the rapid progress of protein residue contact prediction, predicted residue contact maps frequently contain many errors. Many early residue contact prediction methods were derived from statistics and information theory, such as OMES [10], MI [11], MIp [12] and SCA [13]. These methods ignore the transitive correlation between residues and generate many false positive results. The inverse covariance matrix and pseudo-likelihood maximization were introduced subsequently to eliminate transitivity in methods such as DCA [14], PSICOV [15], plmDCA [16], GREMLIN [17], CCMpred [18], FreeContact [19] and PconsC2 [20]. These methods effectively reduce false positive predictions by globally considering all inter-residue correlations. In the latest CASP12 competition, RaptorX-Contact achieved the best performance in the category of template-free modeling targets.
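The transitivity problem of the early local statistics can be illustrated with plain mutual information between columns of a multiple sequence alignment. The sketch below is a minimal textbook MI estimate from raw frequencies (no pseudocounts, sequence weighting or corrections such as those in MIp), and the alignment columns are invented for illustration.

```python
import math
from collections import Counter
from typing import Sequence


def mutual_information(col_a: Sequence[str], col_b: Sequence[str]) -> float:
    """Mutual information (in nats) between two alignment columns,
    estimated from raw residue frequencies."""
    n = len(col_a)
    count_a, count_b = Counter(col_a), Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_joint = c / n
        # p_joint / (p_a * p_b), written with raw counts
        mi += p_joint * math.log(p_joint * n * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical columns from a four-sequence alignment.
x = "AAVV"
y = "LLII"   # covaries perfectly with x
w = "DDEE"   # covaries perfectly with y, hence also with x
z = "LILI"   # independent of x

mi_xy = mutual_information(x, y)
mi_xw = mutual_information(x, w)
mi_xz = mutual_information(x, z)
print(mi_xy, mi_xw, mi_xz)
```

Here x and w score exactly as high as x and y. If x contacts y and y contacts w, a pairwise statistic alone cannot tell whether x and w are also in direct contact, which is the kind of indirect signal the global methods listed above are designed to remove.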