Abstract

Protein–protein interaction (PPI) networks are subject to multiple types of bias, partly rooted in technical limitations of the experimental techniques. Another source of bias is the differing frequency with which proteins have been studied for interaction partners. It is generally believed that proteins with a large number of interaction partners tend to be essential, evolutionarily conserved, and involved in disease. It has been repeatedly reported that proteins driving tumor formation have a higher number of PPI partners. However, it has been noticed before that the degree distribution of PPI networks is biased toward disease proteins, which tend to have been studied more often than non-disease proteins. At the same time, for many poorly characterized proteins no interactions have been reported yet. It is unclear to what extent this study bias affects the observation that cancer proteins tend to have more PPI partners. Here, we show that the degree of a protein is a function of the number of times it has been screened for interaction partners. We present a randomization-based method that controls for this bias to decide whether a group of proteins is associated with significantly more PPI partners than the proteomic background. We apply our method to cancer proteins and observe, in contrast to previous studies, no conclusive evidence that cancer proteins have a significantly higher degree than non-cancer proteins when they are compared with proteins that have been studied equally often as bait proteins. Comparing proteins from different tumor types, a more complex picture emerges in which proteins of certain cancer classes have significantly more interaction partners while others are associated with a smaller degree. For example, proteins of several hematological cancers tend to be associated with a higher number of interaction partners than expected by chance. Solid tumors, in contrast, are usually associated with a degree distribution similar to that of equally often studied random protein sets. We discuss the biological implications of these findings. Our work shows that accounting for biases in the PPI network is possible and increases the value of PPI data.
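
The idea of comparing a protein group against random sets of equally often studied proteins can be illustrated with a small sketch. This is not the authors' implementation: the function name `bait_matched_pvalue`, the use of the mean degree as test statistic, the exact matching on bait count, and sampling with replacement are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def bait_matched_pvalue(degree, bait_count, group, n_iter=1000, seed=0):
    """Empirical one-sided p-value that `group` has a higher mean degree
    than random protein sets with the same bait-count profile.

    degree:     dict protein -> number of reported interaction partners
    bait_count: dict protein -> number of times studied as bait
    group:      set of proteins of interest (e.g. cancer proteins)
    """
    rng = random.Random(seed)

    # Pool the background proteins (everything outside the group) by bait count.
    pools = defaultdict(list)
    for protein in sorted(degree):
        if protein not in group:
            pools[bait_count[protein]].append(protein)

    # Only group members with at least one equally often studied
    # background protein can be matched (assumption: exact matching).
    members = [p for p in sorted(group) if pools[bait_count[p]]]
    if not members:
        raise ValueError("no group member has a bait-count-matched background protein")

    observed = mean(degree[p] for p in members)

    hits = 0
    for _ in range(n_iter):
        # Draw one background protein per group member, matched on bait count.
        sample = [rng.choice(pools[bait_count[p]]) for p in members]
        if mean(degree[p] for p in sample) >= observed:
            hits += 1

    # Add-one correction keeps the estimate away from an exact zero.
    return (hits + 1) / (n_iter + 1)
```

A small p-value would suggest that the group has more interaction partners than expected given how often its members were studied; a value near 1 would suggest the opposite.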

Highlights

  • Protein–protein interaction (PPI) networks are important models of the functional organization of the cell

  • Protein–protein interaction networks are associated with two types of biases: technical biases caused by limitations inherent to the experimental techniques applied to generate the PPI networks and study biases driven by the research interests guiding the selection of bait proteins tested for interaction partners

  • Studied proteins have more protein interaction partners: to quantify the relation between the number of times a protein has been studied and the reported number of PPI partners, we computed the degree of each protein from the integrated PPI database HIPPIE (Schaefer et al, 2012)
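
As a rough illustration of the degree computation mentioned in the last highlight, the following sketch derives protein degrees from a list of interaction pairs and summarizes them by bait count. The input layout (plain pairs of protein identifiers), the function names, and the choice to ignore self-interactions are assumptions and do not reflect the actual HIPPIE file format or the authors' code.

```python
from collections import defaultdict
from statistics import median


def degrees_from_edges(edges):
    """Degree = number of distinct interaction partners.

    Assumption: self-interactions are ignored and duplicate pairs count once.
    """
    partners = defaultdict(set)
    for a, b in edges:
        if a != b:
            partners[a].add(b)
            partners[b].add(a)
    return {protein: len(p) for protein, p in partners.items()}


def median_degree_by_bait_count(degree, bait_count):
    """Median degree among proteins that were studied as bait equally often."""
    grouped = defaultdict(list)
    for protein, n_baits in bait_count.items():
        # Proteins without any reported interaction get degree 0.
        grouped[n_baits].append(degree.get(protein, 0))
    return {n: median(values) for n, values in sorted(grouped.items())}
```

If study bias is present, the median degree would be expected to rise with the bait count.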

Introduction

Protein–protein interaction (PPI) networks are important models of the functional organization of the cell. High error rates associated with the experimental methods applied to measure PPIs have been recognized as a major obstacle to building complete and accurate PPI networks (Von Mering et al, 2002). Protein–protein interaction networks are associated with two types of biases: technical biases caused by limitations inherent to the experimental techniques applied to generate the PPI networks, and study biases driven by the research interests guiding the selection of bait proteins tested for interaction partners. Examples of technical biases are the tendency of tandem affinity purification followed by mass spectrometry (TAP/MS) to detect interactions between highly abundant proteins (Von Mering et al, 2002; Björklund et al, 2008; Ivanic et al, 2009) and interactions involving small proteins under 15 kDa (Gavin et al, 2002). Yeast two-hybrid (Y2H) tends to detect interactions between protein pairs located in the nucleus (Jensen and Bork, 2008).
