Abstract

Background: Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast-two-hybrid (Y2H) and tandem affinity-purification/mass-spectrometry (TAP-MS) contain a significant number of false positives and false negatives. In recent years there has been a growing trend to incorporate diverse domain knowledge to support large-scale analysis of PPI networks.

Methods: This paper presents a new algorithm that incorporates Gene Ontology (GO) based semantic similarities to detect protein complexes from PPI networks generated by TAP-MS. By taking co-complex relations in TAP-MS data into account, TAP-MS PPI networks are modelled as a bipartite graph, in which bait proteins form one set of nodes and prey proteins form the other. Similarities between pairs of bait proteins are computed by considering both topological features and GO-driven semantic similarities. Bait proteins are then grouped into clusters based on their pairwise similarities to produce a set of 'seed' clusters. An expansion process is applied to each 'seed' cluster to recruit prey proteins that are significantly associated with the same set of bait proteins, which yields the complete protein complexes.

Results: The proposed algorithm has been applied to real TAP-MS PPI networks. Fifteen quality measures have been employed to evaluate the quality of the generated protein complexes. Experimental results show that the proposed algorithm greatly improves the accuracy of complex identification and outperforms several state-of-the-art clustering algorithms. Moreover, by incorporating semantic similarity, the proposed algorithm is more robust to noise in the networks.
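The pipeline described in the Methods paragraph (bipartite bait-prey model, combined similarity, seed clustering, prey recruitment) can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything specific here is an assumption made for illustration: the Jaccard index as the topological feature, a linear blend with weight `alpha`, threshold-linked connected components as the seed clustering, and a simple fraction cut-off `min_fraction` in place of a statistical significance test. The protein names and GO similarity values are made up.

```python
from itertools import combinations

def jaccard(a, b):
    """Assumed topological feature: overlap of the prey sets of two baits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def bait_similarity(b1, b2, preys, go_sim, alpha=0.5):
    """Blend topological and GO-based similarity (alpha is a hypothetical weight)."""
    topo = jaccard(preys[b1], preys[b2])
    sem = go_sim.get(frozenset((b1, b2)), 0.0)  # precomputed GO similarity, 0 if unknown
    return alpha * topo + (1 - alpha) * sem

def seed_clusters(preys, go_sim, threshold=0.4, alpha=0.5):
    """Link bait pairs whose similarity reaches the threshold; return connected components."""
    parent = {b: b for b in preys}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for b1, b2 in combinations(sorted(preys), 2):
        if bait_similarity(b1, b2, preys, go_sim, alpha) >= threshold:
            parent[find(b1)] = find(b2)

    groups = {}
    for b in preys:
        groups.setdefault(find(b), set()).add(b)
    return list(groups.values())

def expand_seed(seed, preys, min_fraction=0.6):
    """Recruit preys co-purified with at least min_fraction of the seed's baits."""
    counts = {}
    for b in seed:
        for p in preys[b]:
            counts[p] = counts.get(p, 0) + 1
    recruited = {p for p, c in counts.items() if c / len(seed) >= min_fraction}
    return set(seed) | recruited

# Toy bipartite network: bait -> set of prey proteins it pulled down (made-up data).
preys = {
    "B1": {"P1", "P2", "P3"},
    "B2": {"P1", "P2", "P4"},
    "B3": {"P7", "P8"},
}
# Hypothetical GO semantic similarities for bait pairs.
go_sim = {frozenset(("B1", "B2")): 0.9, frozenset(("B1", "B3")): 0.1}

complexes = [expand_seed(s, preys) for s in seed_clusters(preys, go_sim)]
```

In this toy example B1 and B2 share two of four preys and have a high GO similarity, so they form one seed, and only the preys seen with both baits are recruited. B3 stays alone and keeps its own preys.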

Highlights

  • Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation

  • The algorithm rests on the assumption that, because a tandem affinity-purification/mass-spectrometry (TAP-MS) experiment directly detects complex membership by purifying prey proteins that are co-associated with tagged bait proteins [5,6], a protein complex is intuitively composed of a set of bait proteins along with a set of prey proteins that are significantly associated with the same set of bait proteins

  • To gauge the effect of incorporating semantic similarity in the clustering process, we first compare the proposed algorithm against BGCA [17]


Summary

Introduction

Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. Graph-based clustering algorithms are an effective approach to identifying protein complexes. In 2000, the Markov Clustering Algorithm (MCL) [9] was proposed for identifying complexes from protein interaction networks by simulating random walks on the graph. In 2003, Bader and Hogue [10] represented PPI networks using their proposed ‘Spoke’ and ‘Matrix’ models, and applied the Molecular Complex Detection (MCODE) algorithm to detect protein complexes from the two models. In 2006, Brohée and van Helden [11] evaluated the performance of four clustering algorithms, including MCL and MCODE, in detecting protein complexes. A random-walk-based clustering algorithm, Repeated Random Walks (RRW) [13], was proposed to identify overlapping protein complexes in PPI networks, and experimental results demonstrated that RRW obtained clusters with higher precision than MCL [12]. Experimental results [14] showed that COACH achieved better performance than several existing clustering algorithms.
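The random-walk idea behind MCL mentioned above can be sketched in a few lines of Python. This is a generic, simplified illustration of the expansion and inflation steps, not the implementation evaluated in the cited studies. The inflation value, the iteration count, the pruning tolerance and the toy graph are all assumptions.

```python
def normalise(m):
    """Scale each column to sum to 1, giving a column-stochastic matrix."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def expansion_step(m):
    """Expansion: square the matrix, i.e. let the random walk take two steps."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inflation_step(m, r):
    """Inflation: raise entries to the power r and renormalise, boosting strong flows."""
    return normalise([[x ** r for x in row] for row in m])

def mcl(adj, inflation=2.0, iterations=50, tol=1e-6):
    n = len(adj)
    # Add self-loops before normalising, as is customary for MCL.
    m = normalise([[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                   for i in range(n)])
    for _ in range(iterations):
        m = inflation_step(expansion_step(m), inflation)
    # Rows that still carry flow act as attractors; their non-zero columns form a cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > tol)
        if members:
            clusters.add(members)
    return clusters

# Toy graph: two disconnected triangles (nodes 0-2 and 3-5), made up for illustration.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(adj)
```

On this toy graph no flow can cross between the two triangles, so each triangle is returned as one cluster. On real, noisy PPI networks the inflation parameter controls how finely the graph is split.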
