On the Robustness of Graph-Based Clustering to Random Network Alterations

R Greg Stacey,Michael A Skinnider,Leonard J Foster

doi:10.1074/mcp.ra120.002275

Abstract

Biological functions emerge from complex and dynamic networks of protein–protein interactions. Because these protein–protein interaction networks, or interactomes, represent pairwise connections within a hierarchically organized system, it is often useful to identify higher-order associations embedded within them, such as multimember protein complexes. Graph-based clustering techniques are widely used to accomplish this goal, and dozens of field-specific and general clustering algorithms exist. However, interactomes can be prone to errors, especially when inferred from high-throughput biochemical assays. Therefore, robustness to network-level noise is an important criterion. Here, we tested the robustness of a range of graph-based clustering algorithms in the presence of noise, including algorithms common across domains and those specific to protein networks. Strikingly, we found that all of the clustering algorithms tested here markedly amplified network-level noise. Randomly rewiring only 1% of network edges yielded more than a 50% change in clustering results. Moreover, we found the impact of network noise on individual clusters was not uniform: some clusters were consistently robust to injected noise, whereas others were not. Therefore, we developed the clust.perturb R package and Shiny web application to measure the reproducibility of clusters by randomly perturbing the network. We show that clust.perturb results are predictive of real-world cluster stability: poorly reproducible clusters as identified by clust.perturb are significantly less likely to be reclustered across experiments. We conclude that graph-based clustering amplifies noise in protein interaction networks, but quantifying the robustness of a cluster to network noise can separate stable protein complexes from spurious associations.
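The perturbation idea described in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the clust.perturb implementation: the function names (`rewire`, `reproducibility`), the use of connected components as a stand-in clustering algorithm, and the scoring rule (best-match Jaccard index, averaged over perturbation rounds) are all chosen here for illustration.

```python
import random

def normalize(edges):
    """Store each undirected edge as a sorted tuple so (a, b) == (b, a)."""
    return {tuple(sorted(e)) for e in edges}

def rewire(edges, fraction, rng):
    """Remove a random `fraction` of edges and add the same number of new ones."""
    edges = normalize(edges)
    nodes = sorted({n for e in edges for n in e})
    n_swap = round(fraction * len(edges))
    removed = set(rng.sample(sorted(edges), n_swap))
    new_edges = edges - removed
    # Assumption: the graph is sparse enough that unused node pairs exist.
    while len(new_edges) < len(edges):
        a, b = sorted(rng.sample(nodes, 2))
        if (a, b) not in edges:  # never re-add an original edge
            new_edges.add((a, b))
    return new_edges

def connected_components(edges):
    """Placeholder clustering: each connected component is one cluster."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, clusters = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        clusters.append(frozenset(comp))
    return clusters

def jaccard(a, b):
    return len(a & b) / len(a | b)

def reproducibility(edges, cluster_fn, fraction=0.01, n_iter=20, seed=0):
    """Per-cluster score: mean best-match Jaccard over perturbed networks."""
    rng = random.Random(seed)
    original = cluster_fn(normalize(edges))
    totals = {c: 0.0 for c in original}
    for _ in range(n_iter):
        perturbed = cluster_fn(rewire(edges, fraction, rng))
        for c in original:
            totals[c] += max(jaccard(c, p) for p in perturbed)
    return {c: t / n_iter for c, t in totals.items()}

# Toy network: two triangles (hypothetical data).
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F")]
scores = reproducibility(edges, connected_components, fraction=0.2)
```

Any function that maps an edge set to a list of node sets can be passed as `cluster_fn`, so the same loop applies to other clustering algorithms. A score near 1 means the cluster reappears almost unchanged after rewiring.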

Highlights

Metrics for measuring cluster similarity can be misleading and unintuitive. When measured with an intuitive metric, clustering is very sensitive to network noise. Some clusters within a set are robust to network changes. clust.perturb predicts robust clusters by randomly perturbing the network.
To arrive at general findings, we analyzed two literature-curated interactomes [8, 9]; three large-scale human interactomes derived from affinity purification–mass spectrometry (AP-MS) or yeast two-hybrid (Y2H) techniques, including one weighted interactome [10, 11, 12]; a network of drug–drug side-effects [13]; a representative social network [14, 15]; and 28 protein–protein interaction networks derived from co-fractionation mass spectrometry experiments generated by our group [16, 17, 18, 19].
We established a suitable metric to measure changes in clustering solutions after injection of noise into a network. We used this metric to demonstrate that clustering amplifies network noise, i.e., the ratio of network-level noise to set-wise Jaccard index J is greater than 1, such that injection of a small degree of noise into a network can result in dramatic changes to its clustering.
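A small worked example may help make the set-wise comparison concrete. The definitions below are assumptions, since the excerpt does not spell them out: `setwise_jaccard` scores each cluster in the first set by its best-matching cluster in the second set and averages those scores, and `amplification` is taken to be the change in clustering (1 - J) divided by the fraction of rewired edges, a reading that fits the abstract's figure of a 1% rewiring producing more than a 50% change.

```python
def jaccard(a, b):
    """Jaccard index of two node sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def setwise_jaccard(clusters_a, clusters_b):
    """Mean best-match Jaccard of each cluster in A against all clusters in B."""
    best = [max(jaccard(a, b) for b in clusters_b) for a in clusters_a]
    return sum(best) / len(best)

def amplification(noise_fraction, j):
    """Assumed definition: change in clustering per unit of network noise."""
    return (1.0 - j) / noise_fraction

# Hypothetical clusterings before and after perturbing a network.
before = [{"A", "B", "C", "D"}, {"E", "F", "G", "H"}]
after = [{"A", "B"}, {"C", "D"}, {"E", "F", "G", "H"}]

# The first cluster is split in half (best match 0.5) and the second is
# unchanged (best match 1.0), so J = (0.5 + 1.0) / 2 = 0.75.
j = setwise_jaccard(before, after)
ratio = amplification(0.01, j)  # well above 1 if only 1% of edges were rewired
```

Under this assumed definition, a ratio above 1 means the clustering changed proportionally more than the network did.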

Summary

Introduction

Because these networks are composed of a list of pairwise connections (edges) between members (nodes) and do not explicitly detail higher-order associations, it can be useful to infer higher-order arrangements from the network. This task, called community detection or graph-based clustering, is ubiquitous across fields and is especially important in biology, where the function of a biological macromolecule such as a protein is often mediated by its interacting partners within the network. This is especially true in biological networks constructed from high-throughput experiments, such as protein–protein interaction networks (“interactomes”), where more than half of the expected network edges may vary from experiment to experiment, either because of errors in network reconstruction or changes in experimental conditions [1]. Complicating this issue is the fact that it can be surprisingly ambiguous to measure differences between sets of clusters, in part because metrics for this purpose make different choices about how to penalize false positives (incorrectly merging clusters) versus false negatives (incorrectly separating clusters).
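The asymmetry between merging and splitting errors can be illustrated with pair-counting precision and recall, one common family of cluster-comparison metrics. This is a sketch for illustration only; the excerpt does not say which metrics were compared, and the helper names and toy clusters below are hypothetical.

```python
from itertools import combinations

def co_clustered_pairs(clusters):
    """All unordered node pairs that share a cluster."""
    return {frozenset(p) for c in clusters for p in combinations(sorted(c), 2)}

def pair_precision_recall(reference, predicted):
    """Precision and recall over co-clustered node pairs."""
    ref = co_clustered_pairs(reference)
    pred = co_clustered_pairs(predicted)
    true_pos = len(ref & pred)
    precision = true_pos / len(pred) if pred else 1.0
    recall = true_pos / len(ref) if ref else 1.0
    return precision, recall

reference = [{"A", "B", "C"}, {"D", "E", "F"}]
merged = [{"A", "B", "C", "D", "E", "F"}]     # two clusters wrongly merged
split = [{"A", "B", "C"}, {"D"}, {"E", "F"}]  # one cluster wrongly separated

# Merging adds false-positive pairs: precision drops, recall stays at 1.
merged_precision, merged_recall = pair_precision_recall(reference, merged)
# Splitting loses true pairs: recall drops, precision stays at 1.
split_precision, split_recall = pair_precision_recall(reference, split)
```

A metric that weights precision more heavily will rank the split result above the merged one, and a recall-weighted metric will do the opposite, which is one way two reasonable metrics can disagree about the same pair of clusterings.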



Journal: Molecular & Cellular Proteomics (MCP)
Publication Date: Jan 1, 2021
Citations: 9
License type: CC BY


