Markov clustering versus affinity propagation for the partitioning of protein interaction graphs

James Vlasblom, Shoshana J Wodak

doi:10.1186/1471-2105-10-99

Open Access

Abstract

Background: Genome scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. This task is commonly executed using clustering procedures, which aim at detecting densely connected regions within the interaction graphs. There exists a wealth of clustering algorithms, some of which have been applied to this problem. One of the most successful clustering procedures in this context has been the Markov Cluster algorithm (MCL), which was recently shown to outperform a number of other procedures, some of which were specifically designed for partitioning protein interaction graphs. A promising new clustering procedure termed Affinity Propagation (AP) was recently shown to be particularly effective, and much faster than other methods, for a variety of problems, but has not yet been applied to partition protein interaction graphs.

Results: In this work we compare the performance of the Affinity Propagation (AP) and Markov Clustering (MCL) procedures. To this end we derive an unweighted network of protein-protein interactions from a set of 408 protein complexes from S. cerevisiae hand curated in-house, and evaluate the performance of the two clustering algorithms in recalling the annotated complexes. In doing so, the parameter space of each algorithm is sampled in order to select optimal values for these parameters, and the robustness of the algorithms is assessed by quantifying the level of complex recall as interactions are randomly added to or removed from the network to simulate noise. To evaluate the performance on a weighted protein interaction graph, we also apply the two algorithms to the consolidated protein interaction network of S. cerevisiae, derived from genome scale purification experiments, and to versions of this network in which varying proportions of the links have been randomly shuffled.

Conclusion: Our analysis shows that the MCL procedure is significantly more tolerant to noise and behaves more robustly than the AP algorithm. The advantage of MCL over AP is dramatic for unweighted protein interaction graphs, as AP displays severe convergence problems on the majority of the unweighted graph versions that we tested, whereas MCL continues to identify meaningful clusters, albeit fewer of them, as the level of noise in the graph increases. MCL thus remains the method of choice for identifying protein complexes from binary interaction networks.
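
To make the idea of flow-based clustering concrete, the following is a minimal sketch of the expansion and inflation steps that Markov clustering alternates between. It is not the authors' code or the reference MCL program: the function name `mcl`, the dense NumPy representation, the added self-loops, the inflation value of 2.0, and the toy six-node graph are all assumptions chosen for illustration.

```python
import numpy as np

def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-6):
    """Toy Markov clustering on a dense, symmetric adjacency matrix."""
    m = np.asarray(adjacency, dtype=float).copy()
    np.fill_diagonal(m, 1.0)               # self-loops (assumed convention)
    m /= m.sum(axis=0, keepdims=True)      # make columns sum to 1
    for _ in range(max_iter):
        prev = m
        m = m @ m                          # expansion: two-step random walks
        m = np.power(m, inflation)         # inflation: favour strong flows
        m /= m.sum(axis=0, keepdims=True)  # re-normalise columns
        if np.abs(m - prev).max() < tol:
            break
    # Each row that retains flow is an attractor; its non-zero columns
    # are the members of one cluster. Identical clusters are merged.
    clusters = set()
    for row in m:
        members = frozenset(np.flatnonzero(row > 1e-3).tolist())
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

# Hypothetical toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
toy = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    toy[a, b] = toy[b, a] = 1.0

toy_clusters = mcl(toy)
print(toy_clusters)
```

On a weighted graph the same sketch applies with edge weights in place of the 0/1 entries; real interaction networks would normally call for a sparse representation.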

Highlights

Genome scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another
The original version of these graphs was built from a set of 408 S. cerevisiae protein complexes hand curated in-house [28]
This graph is clearly a less challenging test for clustering procedures than protein interaction networks built from experimental data, since those networks include an appreciable level of spurious links (False Positive links)
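
The graph construction and noise simulation mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the protocol used in the paper: it assumes that every pair of proteins within a complex is linked, that the numbers of removed and added links are fractions of the original edge count, and that the names `graph_from_complexes`, `perturb`, and the proteins `A` to `D` are hypothetical.

```python
import random
from itertools import combinations

def graph_from_complexes(complexes):
    """Unweighted edge set linking every pair of proteins in a complex
    (an assumed construction, not necessarily the paper's)."""
    edges = set()
    for members in complexes:
        for a, b in combinations(sorted(set(members)), 2):
            edges.add((a, b))
    return edges

def perturb(edges, nodes, remove_frac, add_frac, seed=0):
    """Randomly remove true links and add spurious ones to simulate noise.
    Both fractions are relative to the original number of edges."""
    rng = random.Random(seed)
    edges = set(edges)
    n_remove = int(round(remove_frac * len(edges)))
    n_add = int(round(add_frac * len(edges)))
    # Keep a random subset of the original links.
    kept = set(rng.sample(sorted(edges), len(edges) - n_remove))
    # Spurious links are drawn from node pairs that were not linked.
    non_edges = [p for p in combinations(sorted(nodes), 2) if p not in edges]
    added = set(rng.sample(non_edges, min(n_add, len(non_edges))))
    return kept | added

# Hypothetical toy input: two small complexes sharing protein "C".
toy_complexes = [["A", "B", "C"], ["C", "D"]]
toy_edges = graph_from_complexes(toy_complexes)
noisy_edges = perturb(toy_edges, {"A", "B", "C", "D"}, 0.5, 0.5)
```

Clustering each perturbed graph and comparing the clusters with the annotated complexes then gives a recall value per noise level.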

Summary

Introduction

Genome scale data on protein interactions are typically obtained using experimental methods for detecting binary interactions [3,4], or by affinity purifications of tagged proteins coupled to analytical methods for identifying the co-purified partners [5-7]. These data are in general represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another [8,9,10]. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from the protein interaction graphs. This task is commonly carried out using graph clustering procedures, which aim at detecting densely connected regions within the interaction graphs.

Journal: BMC Bioinformatics
Publication Date: Mar 30, 2009
Citations: 229
License type: CC BY 2.0
