Abstract

Background: Identification of protein complexes is crucial for understanding principles of cellular organization and functions. As the size of the protein-protein interaction set increases, a general trend is to represent the interactions as a network and to develop effective algorithms to detect significant complexes in such networks.

Results: Based on a study of known complexes in protein networks, this paper proposes a new topological structure for protein complexes, which combines subgraph diameter (or average vertex distance) with subgraph density. Following the approach of the previously proposed clustering algorithm DPClus, which expands clusters starting from seeded vertices, we present a clustering algorithm, IPCA, based on the new topological structure for identifying complexes in large protein interaction networks. IPCA is applied to the protein interaction network of Saccharomyces cerevisiae and identifies many well-known complexes. Experimental results show that IPCA recalls more known complexes than previously proposed clustering algorithms, including DPClus, CFinder, LCMA, MCODE, RNSC and STM.

Conclusion: The proposed algorithm, based on the new topological structure, makes it possible to identify dense subgraphs in protein interaction networks, many of which correspond to known protein complexes. The algorithm is robust to the known high rate of false positives and false negatives in data from high-throughput interaction techniques. The program is available at .
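The two quantities the abstract combines, subgraph density and subgraph diameter, can be illustrated with a short Python sketch. It uses the standard graph-theoretic definitions (density as the fraction of possible edges present, diameter as the longest shortest path inside the induced subgraph). The abstract does not give the paper's exact formulas, so treating them as these standard definitions is an assumption. The function names and the toy network are hypothetical.

```python
from collections import deque

def density(adj, nodes):
    """Fraction of possible edges present in the subgraph induced by `nodes`."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Each undirected edge is seen from both endpoints, so halve the count.
    m = sum(1 for u in nodes for v in adj.get(u, ()) if v in nodes and v != u) // 2
    return 2.0 * m / (n * (n - 1))

def diameter(adj, nodes):
    """Longest shortest path within the induced subgraph (inf if disconnected)."""
    nodes = set(nodes)
    longest = 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search restricted to `nodes`
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v in nodes and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(nodes):
            return float("inf")
        longest = max(longest, max(dist.values()))
    return longest

# Hypothetical toy network: a triangle A-B-C with a pendant vertex D on C.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(density(toy, {"A", "B", "C"}), diameter(toy, {"A", "B", "C"}))
print(density(toy, toy), diameter(toy, toy))
```

In the toy network the triangle is fully connected, while adding the pendant vertex lowers the density and raises the diameter. A structure combining the two criteria is meant to capture that contrast.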

Highlights

  • Identification of protein complexes is crucial for understanding principles of cellular organization and functions

  • Following the general approach of expanding clusters from seeded vertices, as DPClus does, we develop an algorithm, IPCA, for detecting protein complexes based on the new topological structure

  • We discuss the effect of the value Tin on clustering, compare the predicted clusters with the known complexes, evaluate the significance of the predicted clusters, and analyze the robustness and efficiency of the algorithm IPCA
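The seed-expansion idea named in the highlights can be sketched as a greedy loop. This is an illustrative sketch, not the IPCA procedure itself, because this page does not define Tin or the expansion rule. The sketch assumes Tin is a threshold on the fraction of current cluster members that a candidate vertex interacts with, and that the cluster's diameter is capped by a hypothetical `max_diameter` parameter. All names are hypothetical.

```python
from collections import deque

def _diameter(adj, nodes):
    """Longest shortest path within the induced subgraph (inf if disconnected)."""
    longest = 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in nodes and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(nodes):
            return float("inf")
        longest = max(longest, max(dist.values()))
    return longest

def expand_cluster(adj, seed, t_in=0.5, max_diameter=2):
    """Greedily grow a cluster from `seed` (assumed reading of Tin, see lead-in)."""
    cluster = {seed}
    while True:
        # Candidates are neighbours of the cluster that are not yet members.
        candidates = {v for u in cluster for v in adj[u]} - cluster
        best, best_score = None, 0.0
        for v in sorted(candidates):  # sorted for deterministic tie-breaking
            # Fraction of current members that v interacts with.
            score = len(adj[v] & cluster) / len(cluster)
            if (score >= t_in and score > best_score
                    and _diameter(adj, cluster | {v}) <= max_diameter):
                best, best_score = v, score
        if best is None:
            return cluster
        cluster.add(best)

# Hypothetical toy network: a triangle A-B-C with a pendant vertex D on C.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(sorted(expand_cluster(toy, "A", t_in=0.5)))
print(sorted(expand_cluster(toy, "A", t_in=0.3)))
```

Under this reading, lowering `t_in` admits more loosely attached vertices (here the pendant vertex D), which is one way a threshold of this kind could affect the resulting clusters.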


Summary

Introduction

Identification of protein complexes is crucial for understanding principles of cellular organization and functions. As the size of the protein-protein interaction set increases, a general trend is to represent the interactions as a network and to develop effective algorithms to detect significant complexes in such networks. Protein complexes can help us to understand certain biological processes and to predict the functions of proteins. Most proteins seem to function within complicated cellular pathways, interacting with other proteins either in pairs or as components of larger complexes [1,2]. Various methods have been used to detect protein complexes. Large-scale mass-spectrometric studies in Saccharomyces cerevisiae provide a compendium of protein complexes that are considered to play a key role in carry-
