Sequence co-evolution gives 3D contacts and structures of protein complexes.

Thomas A Hopf, Debora S Marks, João P G L M Rodrigues, Oliver Kohlbacher, Alexandre M J J Bonvin, Chris Sander, Charlotta P I Schärfe, Anna G Green

doi:10.7554/elife.03430

Abstract

Protein-protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein-protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein-protein interaction networks and used for interaction predictions at residue resolution.

Highlights

A large part of biological research is concerned with the identity, dynamics and specificity of protein interactions.
Because the evolutionary couplings method depends on large numbers of diverse sequences [34], some assumption must be made about which homologous proteins in other species interact with each other.
To assemble the broadest possible data sets for testing the approach and making predictions, we take all known interacting proteins from a published dataset that contains ~3500 high-confidence protein interactions in E. coli [5].

Summary

Introduction

A large part of biological research is concerned with the identity, dynamics and specificity of protein interactions. One way to address the knowledge gap in protein interactions has been the use of hybrid computational-experimental approaches, which typically combine 3D structural information at varying resolutions, homology models and other methods [6] with force field-based approaches such as RosettaDock, residue cross-linking, and data-driven approaches that incorporate various sources of biological information [1,7-16]. Most of these approaches depend on the availability of prior knowledge, and many biologically relevant systems remain out of reach because additional experimental information is sparse (e.g. membrane proteins, transient interactions and large complexes). Just a small number of key residue-residue contacts across a protein interface would allow computation of 3D models and provide a powerful approach that is orthogonal to experiments.
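The idea that correlated sequence changes across two proteins can point to residue pairs in contact can be illustrated with a toy calculation. The sketch below is only an illustration: it scores column pairs with plain mutual information, a much simpler covariation measure than the evolutionary couplings method the paper relies on, and it assumes that row k of each alignment comes from the same species. The function names and the tiny alignments are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two equal-length alignment columns."""
    n = len(col_a)
    count_a, count_b = Counter(col_a), Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    return sum(
        (c / n) * log((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
        for (x, y), c in count_ab.items()
    )


def inter_protein_scores(aln_a, aln_b):
    """Rank every (i, j) column pair across two row-paired alignments.

    Assumption: aln_a[k] and aln_b[k] are the homologs of protein A and
    protein B from the same species, so the rows can be compared directly.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("alignments must contain the same number of paired rows")
    cols_a = ["".join(col) for col in zip(*aln_a)]
    cols_b = ["".join(col) for col in zip(*aln_b)]
    scores = {
        (i, j): mutual_information(ca, cb)
        for i, ca in enumerate(cols_a)
        for j, cb in enumerate(cols_b)
    }
    # Highest-scoring pairs first: these are the candidate inter-protein contacts.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy paired alignments (hypothetical data): position 0 of protein A
# changes together with position 1 of protein B; all other columns are conserved.
aln_a = ["AKL", "AKL", "GKL", "GKL"]
aln_b = ["DE", "DE", "DK", "DK"]

ranked = inter_protein_scores(aln_a, aln_b)
top_pair, top_score = ranked[0]
print(top_pair, round(top_score, 3))
```

In this toy input only the column pair (0, 1) co-varies, so it ranks first while the conserved columns score zero. Real alignments would need corrections for phylogenetic bias and indirect correlations, which is what global statistical models such as evolutionary couplings are designed to handle.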

Methods

Results

Conclusion