Abstract

Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum-entropy-related correlated mutation measures (CMMs) such as direct information, which decouple direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum-entropy-based CMM at the residue level, we develop an interface-level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface-level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with the co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and is also likely to improve interface prediction for complexes with large, good-quality MSAs.
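To make the idea of an interface-level score concrete, the following is a minimal sketch of how residue-pair scores might be aggregated per docking decoy and used for re-ranking. The abstract does not state how the interface-level CMM score is actually computed, so everything here is an assumption for illustration: the z-score standardisation, the use of a mean over contacting residue pairs, and the hypothetical names `standardise`, `interface_score` and `rerank`. The residue-level scores themselves (for example direct information) are taken as a given input matrix.

```python
import numpy as np

def standardise(scores):
    """Z-score a matrix of residue-pair CMM scores (assumed standardisation)."""
    scores = np.asarray(scores, dtype=float)
    std = scores.std()
    if std == 0:
        return np.zeros_like(scores)
    return (scores - scores.mean()) / std

def interface_score(z, contacts):
    """Mean standardised score over the inter-protein residue pairs
    (i in protein A, j in protein B) in contact in one decoy.
    Averaging is an illustrative choice, not the paper's stated definition."""
    if not contacts:
        return 0.0
    return float(np.mean([z[i, j] for i, j in contacts]))

def rerank(z, decoys):
    """Sort decoy names by descending interface score.
    `decoys` maps a decoy name to its list of (i, j) contact pairs."""
    return sorted(decoys, key=lambda name: interface_score(z, decoys[name]),
                  reverse=True)

# Toy input: 2 residues in protein A (rows) x 2 residues in protein B (columns).
z = standardise([[0.0, 1.0], [2.0, 5.0]])
decoys = {"a": [(0, 0)], "b": [(1, 1)], "c": [(0, 1), (1, 0)]}
ranking = rerank(z, decoys)
```

In this toy input, decoy "b" contacts the highest-scoring residue pair and therefore ranks first. In practice such a score would be combined with other terms, such as those named in the abstract (number of interface residues, a knowledge-based potential, site variability), rather than used alone.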

Highlights

  • Proteins work together as functional units known as protein complexes to perform the majority of cellular functions, and the analysis of protein-protein interactions forms an essential part of the “systems biology” enterprise

  • We first investigated whether the residue-level standardised correlated mutation measure (CMM) scores could be used as predictors of inter-protein residue contacts for our data set with small, auto-generated multiple sequence alignments (MSAs)

  • We have demonstrated the predictive power of a maximum entropy-based correlated mutation measure for protein complex interaction interfaces, using proteins for which only small, auto-generated multiple sequence alignments exist

Summary

Introduction

Proteins work together as functional units known as protein complexes to perform the majority of cellular functions, and the analysis of protein-protein interactions forms an essential part of the “systems biology” enterprise. Proteins have to evolve in parallel with their interacting partners to maintain the functional repertoire of protein complexes. This evolution can be traced by analysing the amino acid sequences of proteins by means of multiple sequence alignments (MSAs).

Relevant newly generated data are within the paper and its Supporting Information.

