Abstract

Background: The post-genomic era, with its wealth of sequences, gave rise to a broad range of protein residue-residue contact detecting methods. Although various coevolution methods such as PSICOV, DCA and plmDCA provide correct contact predictions, their predictions do not completely overlap. Hence, new approaches and improvements of existing methods are needed to motivate further development and progress in the field. We present a new contact detecting method, COUSCOus, by combining the best shrinkage approach, the empirical Bayes covariance estimator and GLasso.

Results: Using the original PSICOV benchmark dataset, COUSCOus achieves mean accuracies of 0.74, 0.62 and 0.55 for the top L/10 predicted long, medium and short range contacts, respectively. In addition, COUSCOus attains mean areas under the precision-recall curves of 0.25, 0.29 and 0.30 for long, medium and short range contacts and outperforms PSICOV. We also observed that COUSCOus outperforms PSICOV with respect to the Matthews correlation coefficient on the full list of residue contacts. Furthermore, COUSCOus achieves an average gain of 10% in prediction accuracy over PSICOV on an independent test set composed of CASP11 protein targets. Finally, we showed that in a simple random forest meta-classifier that combines contact detecting techniques with sequence-derived features, PSICOV predictions should be replaced by the more accurate COUSCOus predictions.

Conclusion: We conclude that considering superior covariance shrinkage approaches will benefit several research fields that apply the GLasso procedure, among them the residue-residue contact prediction presented here as well as fields such as gene network reconstruction.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1400-3) contains supplementary material, which is available to authorized users.
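The "top L/10" accuracy quoted in the results can be illustrated with a minimal sketch. It assumes that L is the sequence length, that predictions arrive as scored residue pairs, and that a range class is selected by sequence separation. The function name, the separation bound of 24 and the toy data are hypothetical and are not taken from the paper.

```python
def top_fraction_precision(scored_pairs, true_contacts, seq_len,
                           min_sep, max_sep=None, divisor=10):
    """Fraction of the top L/divisor predictions that are true contacts.

    scored_pairs : iterable of (i, j, score) predictions
    true_contacts: set of (i, j) tuples with i < j
    min_sep, max_sep: assumed sequence-separation bounds for a range class
    """
    # Keep only pairs whose sequence separation falls in the requested range.
    in_range = [
        (i, j, s) for i, j, s in scored_pairs
        if abs(i - j) >= min_sep and (max_sep is None or abs(i - j) <= max_sep)
    ]
    # Rank by predicted score, highest first, and keep the top L/divisor.
    in_range.sort(key=lambda t: t[2], reverse=True)
    top = in_range[:max(1, seq_len // divisor)]
    if not top:
        return 0.0
    hits = sum(1 for i, j, _ in top if (min(i, j), max(i, j)) in true_contacts)
    return hits / len(top)


# Hypothetical toy data: a protein of length 30, so L/10 = 3 predictions.
toy_predictions = [
    (0, 25, 0.9), (1, 28, 0.8), (2, 29, 0.7), (0, 29, 0.1),
    (3, 5, 0.95),  # short separation, excluded from the long-range class
]
toy_truth = {(0, 25), (2, 29)}

# Assumed long-range bound of 24 residues (illustrative only).
long_range_precision = top_fraction_precision(
    toy_predictions, toy_truth, seq_len=30, min_sep=24
)
print(round(long_range_precision, 3))
```

The mean accuracies in the results would then correspond to averaging such a per-protein value over the benchmark set, once for each range class.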

Highlights

  • The post-genomic era with its wealth of sequences gave rise to a broad range of protein residue-residue contact detecting methods

  • We developed Contact predictiOn Using Shrinked COvariance (COUSCOus), a residue-residue contact detecting method approaching contact inference in a similar manner as Protein Sparse Inverse COVariance (PSICOV), by applying the sparse inverse covariance estimation technique introduced by Meinshausen and Bühlmann [23]

  • A residue-residue pair is hereby considered to be in contact if the two amino acids are proximal in the 3D structure, in particular if their Cβ-Cβ (Cα in the case of glycine) distance is less than 8 Ångström (Å)
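The contact definition in the last highlight can be sketched in a few lines. This is a minimal illustration, assuming each residue is already reduced to one representative atom coordinate (Cβ, or Cα for glycine); the function names and the toy coordinates are hypothetical.

```python
import math

CONTACT_CUTOFF = 8.0  # Å, the distance threshold stated in the text


def is_contact(coord_i, coord_j, cutoff=CONTACT_CUTOFF):
    """True if two representative atoms are strictly closer than the cutoff."""
    return math.dist(coord_i, coord_j) < cutoff


def contact_pairs(coords, cutoff=CONTACT_CUTOFF):
    """Return all residue index pairs (i, j), i < j, that are in contact."""
    n = len(coords)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if is_contact(coords[i], coords[j], cutoff)
    ]


# Hypothetical representative-atom coordinates (in Å) for four residues.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (20.0, 0.0, 0.0),
]
print(contact_pairs(toy_coords))  # [(0, 1), (0, 2), (1, 2)]
```

In practice, contact prediction benchmarks usually also filter pairs by sequence separation, which this sketch leaves out.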

Summary

Introduction

The post-genomic era with its wealth of sequences gave rise to a broad range of protein residue-residue contact detecting methods. New approaches and improvements of existing methods are needed to motivate further development and progress in the field. We present a new contact detecting method, COUSCOus, by combining the best shrinkage approach, the empirical Bayes covariance estimator and GLasso. Conserved or slightly mutated columns of a multiple sequence alignment (MSA) indicate positions that are important for protein stability and function. Non-conserved positions may play key roles in maintaining functionality when accompanied by compensatory mutations at other positions [1, 2]. Owing to the substantial increase in sequence data in the post-genomic era, a broad range of methods for detecting residue-residue contacts from MSAs has been introduced in the past decades.
