Abstract

Background: As the size of the known human interactome grows, biologists increasingly rely on computational tools to identify patterns that represent protein complexes and pathways. Previous studies have shown that densely connected network components frequently correspond to community structure and functionally related modules. In this work, we present a novel method to identify densely connected and bipartite network modules based on a log odds score for shared neighbours.

Results: To evaluate the performance of our method (NeMo), we compare it to other widely used tools for community detection including kMetis, MCODE, and spectral clustering. We test these methods on a collection of synthetically constructed networks and the set of MIPS human complexes. We apply our method to the CXC chemokine pathway and find a high scoring functional module of 12 disconnected phospholipase isoforms.

Conclusion: We present a novel method that combines a unique neighbour-sharing score with hierarchical agglomerative clustering to identify diverse network communities. The approach is unique in that it identifies both dense network and dense bipartite network structures within a single procedure. Our results suggest that the performance of NeMo is better than or competitive with leading approaches on both real and synthetic datasets. We minimize model complexity and generalization error in the Bayesian spirit by integrating out nuisance parameters. An implementation of our method is freely available for download as a plugin to Cytoscape through our website and through Cytoscape itself.
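
To make the idea of clustering on shared neighbours concrete, the following is a minimal sketch and not the NeMo implementation. The abstract does not give the form of the log odds score, so the sketch substitutes a simple stand-in: the negative log of a hypergeometric tail probability for the number of neighbours two nodes share. The function names, the average-linkage rule, the threshold, and the toy network are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb, log


def neighbours(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shared_neighbour_score(adj, u, v):
    """Stand-in score (assumption, not the paper's formula):
    -log P(X >= k), where X is hypergeometric and k is the observed
    number of neighbours shared by u and v."""
    n = len(adj) - 2                      # candidate neighbours other than u, v
    a = adj[u] - {v}
    b = adj[v] - {u}
    k, da, db = len(a & b), len(a), len(b)
    tail = sum(comb(da, i) * comb(n - da, db - i)
               for i in range(k, min(da, db) + 1)) / comb(n, db)
    return -log(tail)


def agglomerate(adj, threshold):
    """Greedy average-linkage agglomeration on the pairwise score.
    Stops when the best remaining merge scores below `threshold`."""
    clusters = [frozenset([node]) for node in adj]

    def link(c1, c2):
        total = sum(shared_neighbour_score(adj, u, v) for u in c1 for v in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = max(combinations(clusters, 2), key=lambda pair: link(*pair))
        if link(*best) < threshold:
            break
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return clusters


# Toy network (hypothetical): x1..x3 are not connected to one another but
# share the neighbours h1..h3; p1..p4 form an unrelated path.
edges = [(h, x) for h in ("h1", "h2", "h3") for x in ("x1", "x2", "x3")]
edges += [("p1", "p2"), ("p2", "p3"), ("p3", "p4")]
adj = neighbours(edges)
modules = agglomerate(adj, threshold=2.0)
```

In this toy network the nodes x1, x2, and x3 end up in one module even though no edge joins them, because the score looks only at shared neighbours. This mirrors, in a very simplified way, the bipartite structures the abstract describes.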

Highlights

  • As the size of the known human interactome grows, biologists increasingly rely on computational tools to identify patterns that represent protein complexes and pathways

  • Algorithm comparison on synthetic data: To verify the effectiveness of our approach, we compare two variants of the NeMo algorithm to a selection of widely used community finding algorithms including kMetis, MCODE, and spectral clustering

  • For kMetis, MCODE, and spectral clustering, the set of putative modules for a synthetic network is taken to be the union of all modules for all parameter settings previously discussed
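
The pooling step in the last highlight can be sketched as a plain set union. This is an illustrative sketch only: the function name `union_of_modules` and the example runs are hypothetical, and each module is assumed to be represented as a set of node identifiers.

```python
def union_of_modules(runs):
    """Pool the modules found under every parameter setting.

    `runs` holds one list of modules per parameter setting; each module
    is an iterable of node identifiers. Modules found under several
    settings appear only once in the result."""
    pooled = set()
    for modules in runs:
        pooled.update(frozenset(module) for module in modules)
    return pooled


# Hypothetical output of one tool under two parameter settings.
run_a = [{"n1", "n2", "n3"}, {"n4", "n5"}]
run_b = [{"n1", "n2", "n3"}, {"n4", "n5", "n6"}]
putative = union_of_modules([run_a, run_b])
```

Here the module {n1, n2, n3} is found under both settings but is counted once, so `putative` holds three distinct modules.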

Summary

Introduction

As the size of the known human interactome grows, biologists increasingly rely on computational tools to identify patterns that represent protein complexes and pathways. We present a novel method to identify densely connected and bipartite network modules based on a log odds score for shared neighbours. The vast amount of molecular biology data presents us with new organizational challenges as we seek to extract knowledge from whole-genome experimental assays. [...] proteins have become increasingly diverse. Some of these important assays yield protein-protein, protein-DNA, and synthetic lethal genetic interactions. Taken together, these molecular interaction data sets form our picture of the known interactome. With estimates of the size of the complete protein interactome for humans and other metazoans topping 650,000 interactions [1,2,3,4], sophisticated tools are needed to cope with the complexity of biological systems.

BMC Bioinformatics 2010, 11(Suppl 1):S61 http://www.biomedcentral.com/1471-2105/11/S1/S61

