PC-SENE: A node embedding based method for protein complex detection

Xiaoxia Liu, Liang Yang, Kan Xu, Jian Wang, Zhihao Yang, Shengtian Sang, Hongfei Lin, Yin Zhang, Yijia Zhang, Bo Xu, Lei Wang

doi:10.1109/bibm.2018.8621338

Abstract

With the accumulation of protein-protein interaction (PPI) datasets, various computational methods have been developed for identifying protein complexes from PPI networks. However, many existing computational methods have limitations: supervised learning approaches require tedious feature engineering, and the quality measures that guide the mining process of unsupervised methods do not fully reflect the properties of a protein complex in a PPI network. In this work, we propose a novel protein complex detection method named PC-SENE. For given seeds, it uses an alias sampling strategy based on protein node embedding similarities to select potential addable nodes, and it uses a new conductance measure to decide whether to extend the current candidate subgraph in order to find protein complexes. Intuitively, a well-trained node embedding vector can preserve both the topological characteristics of the PPI network and the diversity of connectivity patterns of nodes in the network, so node embedding similarities can better reflect the relationships between nodes. The experimental results show the robustness and effectiveness of PC-SENE.
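The seed-expansion idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: cosine similarity as the embedding similarity, the mean similarity to the current subgraph (shifted to be positive) as the sampling weight, and the standard conductance (cut edges divided by the smaller of the two volumes) as a stand-in for the new conductance measure, whose definition the abstract does not give. The names `build_alias_table`, `alias_draw`, `grow_from_seed`, `TOY_GRAPH`, and `TOY_EMB` are hypothetical.

```python
import math
import random


def build_alias_table(weights):
    """Walker's alias method: O(n) setup that allows O(1) weighted draws."""
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]
    prob, alias = [0.0] * n, [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] -= 1.0 - scaled[s]  # l donates mass to fill slot s
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:  # leftovers are (numerically) full slots
        prob[i] = 1.0
    return prob, alias


def alias_draw(prob, alias, rng):
    """Draw one index with probability proportional to the original weights."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


def cosine(u, v):
    """Cosine similarity (assumed similarity measure); 0.0 for zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def conductance(graph, nodes):
    """Standard conductance, used here as a stand-in for the paper's measure."""
    nodes = set(nodes)
    cut = sum(1 for u in nodes for v in graph[u] if v not in nodes)
    vol_in = sum(len(graph[u]) for u in nodes)
    vol_out = sum(len(graph[u]) for u in graph if u not in nodes)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 1.0


def grow_from_seed(graph, emb, seed, rng, max_steps=50):
    """Grow a candidate subgraph from a seed, one sampled neighbor at a time."""
    cluster = {seed}
    for _ in range(max_steps):
        frontier = sorted({v for u in cluster for v in graph[u]} - cluster)
        if not frontier:
            break
        # Assumed weighting: 1 + mean cosine similarity to the current members.
        weights = [
            max(1e-9, 1.0 + sum(cosine(emb[v], emb[u]) for u in cluster) / len(cluster))
            for v in frontier
        ]
        prob, alias = build_alias_table(weights)
        candidate = frontier[alias_draw(prob, alias, rng)]
        # Extend only if the candidate lowers the conductance of the subgraph.
        if conductance(graph, cluster | {candidate}) < conductance(graph, cluster):
            cluster.add(candidate)
    return cluster


# Hypothetical toy network: two triangles joined by the single edge c-d.
TOY_GRAPH = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
# Hypothetical 2-D embeddings: one direction per triangle.
TOY_EMB = {
    "a": (1.0, 0.1), "b": (0.9, 0.0), "c": (1.0, 0.2),
    "d": (0.1, 1.0), "e": (0.0, 0.9), "f": (0.2, 1.0),
}

result = grow_from_seed(TOY_GRAPH, TOY_EMB, "a", random.Random(0))
```

On this toy network, growth from seed `a` is expected to stay within the first triangle, because adding `d` would raise the conductance of the candidate subgraph. The alias table is rebuilt at every step here for simplicity; a real implementation would likely cache or update it.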

Full Text