Abstract

Despite exceptional experimental efforts to map out the human interactome, the continued data incompleteness limits our ability to understand the molecular roots of human disease. Computational tools offer a promising alternative, helping identify biologically significant, yet unmapped protein-protein interactions (PPIs). While link prediction methods connect proteins on the basis of biological or network-based similarity, interacting proteins are not necessarily similar and similar proteins do not necessarily interact. Here, we offer structural and evolutionary evidence that proteins interact not if they are similar to each other, but if one of them is similar to the other’s partners. This approach, which mathematically relies on network paths of length three (L3), significantly outperforms all existing link prediction methods. Given its high accuracy, we show that L3 can offer mechanistic insights into disease and can complement future experimental efforts to complete the human interactome.

Highlights

  • Despite exceptional experimental efforts to map out the human interactome, the continued data incompleteness limits our ability to understand the molecular roots of human disease

  • To investigate the validity of the triadic closure principle (TCP) hypothesis, we measured the relative number of shared interaction partners of proteins X and Y using the Jaccard similarity J = |N_X ∩ N_Y| / |N_X ∪ N_Y|, where N_X and N_Y are the interaction partners of X and Y

  • While the problem could lie with the limitations of existing network similarity measures[21], we show that the failure of TCP is not rooted in the similarity measure we used, but it fails because it does not capture the biological principles that govern protein-protein interactions (PPIs)
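The Jaccard similarity in the highlights can be illustrated with a minimal sketch. The toy edge list and the helper names `build_neighbors` and `jaccard` are assumptions made for illustration and do not come from the paper; returning 0 for two proteins without any partners is also a convention chosen here.

```python
from collections import defaultdict


def build_neighbors(edges):
    """Map each protein to the set of its interaction partners."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


def jaccard(neighbors, x, y):
    """J = |N_X ∩ N_Y| / |N_X ∪ N_Y| for proteins x and y."""
    n_x = neighbors.get(x, set())
    n_y = neighbors.get(y, set())
    union = n_x | n_y
    if not union:
        # Neither protein has partners; treat similarity as 0 (assumed convention).
        return 0.0
    return len(n_x & n_y) / len(union)


# Hypothetical toy network, not real interactome data.
toy_edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("B", "E")]
toy_neighbors = build_neighbors(toy_edges)
print(jaccard(toy_neighbors, "A", "B"))
```

In this toy network A and B share the partners C and D out of three distinct partners in total (C, D, E), so J = 2/3. Under TCP such a pair would be ranked as a likely interaction; the text argues that this expectation fails for most protein pairs.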


Summary

Introduction

Despite exceptional experimental efforts to map out the human interactome, the continued data incompleteness limits our ability to understand the molecular roots of human disease. We offer structural and evolutionary evidence that proteins interact not if they are similar to each other, but if one of them is similar to the other’s partners. This approach, which mathematically relies on network paths of length three (L3), significantly outperforms all existing link prediction methods. The increasing coverage of the interactome has inspired the development of network-based algorithms, which exploit the patterns characterizing already mapped interactions to identify missing interactions[16,17,18]. Such state-of-the-art network-based link prediction algorithms rely on the triadic closure principle (TCP)[10] (Supplementary Table 1), rooted in social network analysis, namely the observation that the more common friends two individuals have, the more likely it is that they know each other (neighborhood-based similarity)[19,20,21]. Our results in this paper suggest that the failure of TCP is not algorithmic, but fundamental: the hypothesis that protein pairs with similar interaction partners should interact fails for most protein pairs.
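To make the idea of paths of length three concrete, the following is a minimal sketch that counts the paths X-U-V-Y between two proteins. The summary above does not state how the paper weights or normalizes these paths, so the raw count used here is an assumption, as are the toy network and the helper names `build_neighbors` and `count_l3_paths`.

```python
from collections import defaultdict


def build_neighbors(edges):
    """Map each protein to the set of its interaction partners."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


def count_l3_paths(neighbors, x, y):
    """Count paths x - u - v - y that visit four distinct proteins."""
    count = 0
    for u in neighbors.get(x, ()):
        if u in (x, y):
            continue  # skip self-loops and the direct x - y link
        for v in neighbors.get(u, ()):
            if v in (x, u, y):
                continue  # keep all four nodes on the path distinct
            if y in neighbors.get(v, ()):
                count += 1
    return count


# Hypothetical toy network: X and Y share no partners, but X's partners
# (U, W) are also partners of V, which is a partner of Y.
toy_edges = [("X", "U"), ("X", "W"), ("U", "V"), ("W", "V"), ("V", "Y")]
toy_neighbors = build_neighbors(toy_edges)

shared = toy_neighbors["X"] & toy_neighbors["Y"]
print(len(shared), count_l3_paths(toy_neighbors, "X", "Y"))
```

Here X and Y have no common partners, so a neighborhood-based (TCP) score would rank the pair at the bottom, yet two paths of length three (X-U-V-Y and X-W-V-Y) connect them. This matches the intuition in the text: X is similar to V, one of Y's partners, rather than to Y itself.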
