Active Semi-Supervised Community Detection Based on Must-Link and Cannot-Link Constraints

Jianjun Cheng, Mingwei Leng, Xiaoyun Chen, Hanhai Zhou, Longjie Li

doi:10.1371/journal.pone.0110088

Abstract

Community structure detection is of great importance because it helps reveal the relationship between the function and the topological structure of a network. Many community detection algorithms have been proposed, but how to incorporate prior knowledge into the detection process remains a challenging problem. In this paper, we propose a semi-supervised community detection algorithm that makes full use of must-link and cannot-link constraints to guide the detection process and thereby extract high-quality community structures from networks. To acquire high-quality must-link and cannot-link constraints, we also propose a semi-supervised component generation algorithm based on active learning, which selects, step by step, the nodes with maximum utility for the proposed semi-supervised community detection algorithm and then generates the must-link and cannot-link constraints by querying a noiseless oracle. Extensive experiments show that introducing active learning into community detection is effective: the proposed method extracts high-quality community structures from networks and significantly outperforms the comparison methods.
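The constraint generation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the utility-based node selection is replaced by a fixed, hypothetical `query_order`, and the function names (`make_oracle`, `generate_components`, `constraints_from_components`) and the toy labels are assumptions made for illustration only. The sketch shows only how a noiseless oracle's same-community answers could be turned into must-link components and cannot-link pairs.

```python
from itertools import combinations


def make_oracle(true_labels):
    """Return a noiseless oracle: True if two nodes share a community."""
    def oracle(u, v):
        return true_labels[u] == true_labels[v]
    return oracle


def generate_components(query_order, oracle):
    """Group queried nodes into must-link components.

    Each node is compared with one representative per existing component.
    It joins the first component it must-links to; otherwise it starts a
    new component. (Assumed procedure, not taken from the paper.)
    """
    components = []
    for node in query_order:
        for comp in components:
            if oracle(node, comp[0]):
                comp.append(node)
                break
        else:
            components.append([node])
    return components


def constraints_from_components(components):
    """Derive pairwise must-link and cannot-link sets from components."""
    must_link = set()
    cannot_link = set()
    # Nodes inside one component are pairwise must-linked.
    for comp in components:
        for u, v in combinations(comp, 2):
            must_link.add(frozenset((u, v)))
    # Nodes in different components are pairwise cannot-linked.
    for a, b in combinations(components, 2):
        for u in a:
            for v in b:
                cannot_link.add(frozenset((u, v)))
    return must_link, cannot_link


# Hypothetical ground truth that only the oracle can see.
true_labels = {0: "A", 1: "A", 2: "B", 3: "B", 4: "A", 5: "C"}
oracle = make_oracle(true_labels)

# Hypothetical selection order standing in for the active-learning step.
components = generate_components([0, 2, 1, 5, 3], oracle)
must_link, cannot_link = constraints_from_components(components)
print(components)
```

Comparing each new node with a single representative per component keeps the number of oracle queries small, because a noiseless oracle makes the must-link relation transitive.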

Highlights

Community structures are significant features observed in many complex networks, meaning that the nodes in a network can be divided naturally into groups, within which connections are relatively dense but between which connections are much sparser.
Active learning algorithm: In this subsection, we present the idea of the proposed semi-supervised component generation algorithm based on active learning.
We carried out two types of experiments: one for testing the ability of the semi-supervised community detection algorithm based on the must-link and cannot-link constraints, and the other for demonstrating the utility of the semi-supervised component generation algorithm based on active learning.

Summary

Introduction

Community structures are significant features observed in many complex networks, meaning that the nodes in a network can be divided naturally into groups, within which connections are relatively dense but between which connections are much sparser. Methods based on random walks exploit the behavior of a random walker to identify community structures: within a limited number of steps, the walker tends to be trapped inside a community rather than walk across community boundaries. Such methods have been applied successfully in many applications [31,32,33,34,35,36,37,38]. Two representative state-of-the-art algorithms based on random walks are the Infohiermap (short for Hierarchical Infomap [36]) algorithm [37], which reveals the best hierarchical community structures in networks by finding the shortest multilevel descriptions of the random walker, and the PPC (short for Personalized PageRank Clustering) algorithm [38], which combines random walks and modularity to efficiently identify the community structures of networks.
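The trapping behavior of a short random walk can be illustrated with a small Python sketch. This is not Infohiermap or PPC; it only propagates the walker's probability distribution for a few steps on a hypothetical toy graph (two 4-node cliques joined by one edge) and measures how much probability remains in the starting community. The graph and all names are assumptions chosen for illustration.

```python
def build_adjacency(edges, n):
    """Build an undirected adjacency list for n nodes."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def walk_distribution(adj, start, steps):
    """Probability of the walker being at each node after `steps` steps."""
    dist = [0.0] * len(adj)
    dist[start] = 1.0
    for _ in range(steps):
        nxt = [0.0] * len(adj)
        for u, p in enumerate(dist):
            if p == 0.0:
                continue
            # The walker moves to a uniformly chosen neighbor.
            share = p / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        dist = nxt
    return dist


# Toy graph: clique {0,1,2,3}, clique {4,5,6,7}, and a single bridge 3-4.
clique_a = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
clique_b = [(4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
adj = build_adjacency(clique_a + clique_b + [(3, 4)], 8)

dist = walk_distribution(adj, start=0, steps=3)
mass_a = sum(dist[:4])  # probability of still being in the first clique
print(f"mass in starting community after 3 steps: {mass_a:.3f}")
```

On this toy graph, most of the probability mass stays in the starting clique after three steps, whereas a walk that had fully mixed would place only half of it there.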

Journal: PLoS ONE
Publication Date: Oct 17, 2014
Citations: 79
License type: CC BY 4.0
