Large-scale discovery of protein interactions at residue resolution using co-evolution calculated from genomic sequences

Anna G Green, Debora S Marks, Hadeer Elhabashy, Kelly P Brock, Rohan Maddamsetti, Oliver Kohlbacher

doi:10.1038/s41467-021-21636-z

Abstract

Increasing numbers of protein interactions have been identified in high-throughput experiments, but only a small proportion have solved structures. Recently, sequence coevolution-based approaches have led to a breakthrough in predicting monomer protein structures and protein interaction interfaces. Here, we address the challenges of large-scale interaction prediction at residue resolution with a fast alignment concatenation method and a probabilistic score for the interaction of residues. Importantly, this method (EVcomplex2) is able to assess the likelihood of a protein interaction, as we show here applied to large-scale experimental datasets where the pairwise interactions are unknown. We predict 504 interactions de novo in the E. coli membrane proteome, including 243 that are newly discovered. While EVcomplex2 does not require available structures, coevolving residue pairs can be used to produce structural models of protein interactions, as done here for membrane complexes including the Flagellar Hook-Filament Junction and the Tol/Pal complex.

Highlights

Increasing numbers of protein interactions have been identified in high-throughput experiments, but only a small proportion have solved structures.
The majority of the protein monomers in the E. coli proteome (3,189 out of a total of 4,391) have high-quality monomer alignments and are amenable to EVcomplex2 (Methods). We verify that these alignments are of high quality by testing the precision of the top evolutionary couplings (ECs) for those monomers with an experimental structure, finding that 78% have reasonable precision of the top ECs (60% for the top L ECs, where L is the protein sequence length) (Supplementary Data 1, Supplementary Fig. 1).
We find that 42% of proteins in the human proteome can be aligned with medium sequence diversity in at least one domain, and 20% of proteins can be aligned with the high diversity cutoff used for E. coli in all of their domains.
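The precision check described in the highlights can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `top_ec_precision`, the input layout (tuples of residue index, residue index, score), and the `min_separation` filter on sequence distance are assumptions made for illustration, and the toy data are invented. The sketch ranks scored residue pairs, keeps the top L, and reports the fraction that coincide with contacts taken from an experimental structure.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[int, int]


def top_ec_precision(
    scored_pairs: Iterable[Tuple[int, int, float]],
    contacts: Set[Pair],
    seq_len: int,
    min_separation: int = 6,  # assumed filter, not taken from the paper
) -> float:
    """Fraction of the top-L ranked residue pairs that are structural contacts."""
    # Store every contact as (i, j) with i < j so lookups ignore pair order.
    contact_set = {tuple(sorted(pair)) for pair in contacts}

    # Drop pairs that are close in sequence, then rank by score, highest first.
    candidates = [
        (tuple(sorted((i, j))), score)
        for i, j, score in scored_pairs
        if abs(i - j) >= min_separation
    ]
    candidates.sort(key=lambda item: item[1], reverse=True)

    top = candidates[:seq_len]  # "top L ECs", with L the sequence length
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in contact_set)
    return hits / len(top)


# Toy example with invented scores and contacts (L = 4).
toy_scores = [
    (1, 10, 0.90), (2, 12, 0.80), (3, 20, 0.70),
    (4, 5, 0.95), (5, 15, 0.10), (6, 18, 0.60),
]
toy_contacts = {(1, 10), (3, 20), (5, 15)}
precision = top_ec_precision(toy_scores, toy_contacts, seq_len=4)

# Share of E. coli monomers with high-quality alignments, from the text above.
alignable_fraction = 3189 / 4391  # roughly 0.73
```

In the toy data the pair (4, 5) is excluded by the separation filter despite its high score, so the top four pairs are (1, 10), (2, 12), (3, 20) and (6, 18), of which two are contacts.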

Summary

Introduction

Increasing numbers of protein interactions have been identified in high-throughput experiments, but only a small proportion have solved structures. There have been many experimental[5,6,7] and computational methods[8,9,10] to identify which proteins interact within an organism at scale, but the only computational methods able to determine both interactions and their precise, residue-resolution interfaces are based on coevolution. Coevolutionary methods such as EVcouplings[11,12] and others[13] have been successful in determining 3D structures by leveraging the vast corpus of natural sequences, using probabilistic graphical models to infer candidate pairs of interacting residues. To demonstrate the potential for eukaryotic complexes, we show successful predictions for eukaryotic-exclusive complexes including the human spliceosome.

Journal: Nature Communications
Publication Date: Mar 2, 2021
Citations: 90
License type: open-access

Similar Papers

PRINCESS, a Protein Interaction Confidence Evaluation System with Multiple Data Sources
Dong Li ... Fuchu He
Molecular & Cellular Proteomics | VOL. 7
01 Jun 2008

Integration of Heterogeneous Experimental Data Improves Global Map of Human Protein Complexes
Jose Lugo-Martinez ... Ziv Bar-Joseph
ACM Conference on Bioinformatics, Computational Biology and Biomedicine (ACM-BCB) | VOL. 2019
04 Sep 2019

Proteome-wide Prediction of Signal Flow Direction in Protein Interaction Networks Based on Interacting Domains
Wei Liu ... Fuchu He
Molecular & Cellular Proteomics | VOL. 8
01 Sep 2009

Predicting the Strongest Domain-Domain Contact in Interacting Protein Pairs
Tom M W Nye ... Sarah Teichmann
Statistical Applications in Genetics and Molecular Biology | VOL. 5
24 Jan 2006
