Abstract

Detecting protein complexes from available protein-protein interaction (PPI) data helps to deepen our understanding of the mechanisms underlying biological activities. In recent years, various computational methods have been developed for identifying protein complexes from PPI networks. Most of these methods rely primarily on topological analysis of the PPI network. However, most of them fail to capture, at the same time, the global and local topological structure of the network and the diversity of connectivity patterns between individual nodes. To address this problem, we propose a node-embedding-based alias sampling extension method for detecting protein complexes. Specifically, for a given set of seed nodes, the method first uses an alias sampling strategy based on protein node embedding similarities to select nodes that could potentially be added. It then uses a new conductance measure, which better quantifies the likelihood that a subgraph is a protein complex, to decide whether to extend the current candidate subgraph. Evaluated on six real yeast PPI networks, our method outperforms state-of-the-art methods in detecting protein complexes. Furthermore, the experimental results show that the protein complexes predicted by our method have higher biological significance.
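The sampling step described above can be illustrated with a minimal sketch of the classic alias method (Vose's variant), which draws from a discrete distribution in constant time after a linear-time setup. This is not the authors' implementation: the function names, the use of cosine similarity as the sampling weight, and the clipping of negative similarities are all assumptions made for illustration.

```python
import math
import random


def build_alias_table(weights):
    """Build (prob, alias) tables for Vose's alias method from non-negative weights."""
    n = len(weights)
    total = float(sum(weights))
    scaled = [w * n / total for w in weights]  # mean of scaled weights is 1
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]      # keep s with this probability
        alias[s] = l             # otherwise fall through to l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:      # leftovers are exactly (or numerically) 1
        prob[i] = 1.0
    return prob, alias


def alias_draw(prob, alias, rng):
    """Draw one index in O(1) using the tables from build_alias_table."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def sample_candidate(seed_vec, candidates, embeddings, rng):
    """Hypothetical helper: pick a candidate node with probability
    proportional to its (clipped) embedding similarity to the seed."""
    # Assumption: negative similarities are clipped to zero; the small
    # epsilon keeps the total weight positive.
    weights = [max(cosine(seed_vec, embeddings[c]), 0.0) + 1e-9 for c in candidates]
    prob, alias = build_alias_table(weights)
    return candidates[alias_draw(prob, alias, rng)]
```

In a seed-extension loop, `sample_candidate` would be called on the neighbours of the current candidate subgraph, and the sampled node would then be accepted or rejected by a separate quality measure.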

Highlights

  • A protein complex is a group of proteins that physically interact with one another to organize various biological processes in the cell.

  • Results: we introduce the evaluation metrics and compare our method against four well-known complex detection approaches on six yeast protein-protein interaction (PPI) networks.

  • We found that our method outperforms six other state-of-the-art algorithms in identifying protein complexes.

Summary

Introduction

A protein complex is a group of proteins that physically interact with one another to organize various biological processes in the cell. The main line of approaches for identifying protein complexes from PPI networks is based on the inherent topological structure of protein complexes [4], [5]. Identifying protein complexes can be formulated as searching for subgraphs that are densely connected internally and well separated from the rest of the network. Building on this idea, detection methods based on machine learning and data mining have grown rapidly and become useful ways to identify protein complexes. Several studies have shown that incorporating well-selected additional biological information improves the performance of protein complex detection [6], [7]. We discuss only methods that rely solely on topological characteristics of the network, since biological information could be added to most of these methods to improve their performance.
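The idea of a subgraph that is "densely connected internally and well separated from the rest of the network" can be made concrete with two standard graph measures: edge density and conductance. The sketch below uses the textbook definitions on an unweighted, undirected graph stored as an adjacency dictionary. It is not the new conductance measure proposed in this work, and the function names and the toy graph are assumptions for illustration.

```python
def density(nodes, adj):
    """Fraction of possible internal edges that are present in the subgraph."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal = sum(1 for u in nodes for v in adj[u] if v in nodes) / 2
    return 2 * internal / (n * (n - 1))


def conductance(nodes, adj):
    """Standard conductance: cut edges divided by the smaller side's volume.
    Lower values mean the subgraph is better separated from the rest."""
    nodes = set(nodes)
    cut = sum(1 for u in nodes for v in adj[u] if v not in nodes)
    vol_in = sum(len(adj[u]) for u in nodes)
    vol_out = sum(len(adj[u]) for u in adj if u not in nodes)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Toy network: a triangle A-B-C attached to a short tail C-D-E.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
triangle = {"A", "B", "C"}
print(density(triangle, adj), conductance(triangle, adj))
```

Here the triangle has every possible internal edge and a single edge leaving it, so it scores high on density and low on conductance, which is the profile a candidate complex is expected to have under this formulation.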

