Abstract

We characterized and evaluated the functional attributes of three yeast high-confidence protein-protein interaction data sets derived from affinity purification/mass spectrometry, protein-fragment complementation assay, and yeast two-hybrid experiments. The interacting proteins retrieved from these data sets formed distinct, partially overlapping sets with different protein-protein interaction characteristics. These differences were primarily a function of the experimental technologies used to recover the interactions. This affected the total coverage of interactions and was especially evident in the recovery of interactions among different functional classes of proteins. We found that the interaction data obtained by the yeast two-hybrid method were the least biased toward any particular functional characterization. In contrast, interacting proteins in the affinity purification/mass spectrometry and protein-fragment complementation assay data sets were over- and under-represented in distinct functional categories. We delineated how these differences affected protein complex organization in the network of interactions, in particular for strongly interacting complexes (e.g., RNA and protein synthesis) versus weakly and transiently interacting complexes (e.g., protein transport). We quantified methodological differences in detecting protein interactions from larger protein complexes, in the correlation of protein abundance among interacting proteins, and in the connectivity of essential proteins. In the latter case, we showed that minimizing inherent methodology biases removed many of the ambiguous conclusions about protein essentiality and protein connectivity. We used these findings to rationalize how biological insights obtained by analyzing data sets originating from different sources sometimes do not agree or may even contradict each other.
An important corollary of this work was that discrepancies in biological insights did not necessarily imply that one detection methodology was better or worse, but rather that, to a large extent, the insights reflected the methodological biases themselves. Consequently, interpreting the protein interaction data within their experimental or cellular context provided the best avenue for overcoming biases and inferring biological knowledge.

Highlights

  • The collection of proteins and protein assemblies in a cell constitutes a vital and integral part of the machinery required to sustain all cellular functions and processes [1]

  • We investigated three high-confidence high-throughput protein-protein interaction data sets (AP/MS, protein-fragment complementation assay (PCA), and Y2H), an unfiltered, raw interaction data set, and the manually curated binary gold standard (BGS) set which focuses on binary protein interactions

  • We first discuss the classification of detected proteins and their interactions based on the detection methodology, and we highlight the impact of the observed differences in the biological properties of the interacting proteins

Introduction

The collection of proteins and protein assemblies in a cell constitutes a vital and integral part of the machinery required to sustain all cellular functions and processes [1]. Because of the large number of potential protein interactions, high-throughput technologies are essential for generating whole-cell maps of these interactions [2, 3]. The yeast two-hybrid (Y2H) and protein-fragment complementation assay (PCA) techniques detect binary interactions, whereas affinity purification/mass spectrometry (AP/MS) techniques purify and identify protein complexes. AP/MS uses tagged bait proteins to bind prey proteins in the native cellular environment, followed by affinity purification and mass spectrometric detection of the proteins. In contrast, both Y2H and PCA rely on separate protein complementation schemes to report whether a protein pair interacts.
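The distinction between binary interaction records (Y2H, PCA) and bait-prey purification records (AP/MS) can be made concrete with a small sketch. The Python example below is illustrative only: the protein names are placeholders, the records are made up, and the Jaccard index is an assumed overlap measure, not necessarily the one used in this work. It collects the proteins covered by each data set and reports how the sets overlap pairwise.

```python
from itertools import combinations

# Toy, made-up records; "A".."E" are placeholder protein names, not real data.
y2h = [("A", "B"), ("B", "C")]          # binary interactions
pca = [("B", "C"), ("C", "D")]          # binary interactions
apms = {"A": ["B", "E"], "D": ["E"]}    # bait -> co-purified prey proteins


def proteins_from_pairs(pairs):
    """Collect every protein that appears in a list of binary interactions."""
    return {protein for pair in pairs for protein in pair}


def proteins_from_purifications(purifications):
    """Collect every bait and prey protein from bait -> preys records."""
    found = set(purifications)
    for preys in purifications.values():
        found.update(preys)
    return found


def jaccard(a, b):
    """Jaccard index: shared items divided by all items (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


protein_sets = {
    "Y2H": proteins_from_pairs(y2h),
    "PCA": proteins_from_pairs(pca),
    "AP/MS": proteins_from_purifications(apms),
}

# Compare every pair of data sets.
for (name_a, set_a), (name_b, set_b) in combinations(protein_sets.items(), 2):
    shared = sorted(set_a & set_b)
    print(f"{name_a} vs {name_b}: shared={shared}, "
          f"jaccard={jaccard(set_a, set_b):.2f}")
```

With real data, the same comparison could be run on interaction pairs rather than proteins, but that would first require a convention for expanding AP/MS purifications into pairs, which this sketch deliberately leaves out.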
