Abstract

Algorithms for finding communities in complex networks are generally unsupervised, relying solely on the structure of the network. However, these methods can often fail to uncover meaningful groupings that reflect the underlying communities in the data, particularly when they are highly overlapping. One way to improve these algorithms is by incorporating human expertise or background knowledge in the form of pairwise constraints to direct the community detection process. In this work, we explore the potential of semi-supervised strategies to improve algorithms for finding overlapping communities in networks. We propose a method, based on label propagation, for finding communities using pairwise constraints. Furthermore, we introduce a new strategy, inspired by active learning, for intelligent constraint selection, which is designed to minimize the level of human annotation required. Extensive evaluations on synthetic and real-world datasets demonstrate the potential of this strategy for effectively uncovering meaningful overlapping community structures, using a limited amount of supervision.

Highlights

  • In many real-world applications involving machine learning, the tasks do not neatly correspond to the standard distinction between supervised and unsupervised learning

  • We first compare the accuracy of the proposed algorithms, Pairwise Constrained SLPA (PC-SLPA) and Active Semi-supervised SLPA (AC-SLPA), with that of the unsupervised Speaker-listener Label Propagation Algorithm (SLPA) on which they are based

  • We compare the accuracy of AC-SLPA, using a limited number of pairwise constraints selected by the proposed active learning-inspired approach, Node Pair Selection, with that of PC-SLPA, where the pairwise constraints are selected at random
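
For readers unfamiliar with the unsupervised baseline named in the highlights, the following is a minimal sketch of speaker-listener label propagation as it is commonly described. It is not the authors' implementation and does not include the pairwise constraints of PC-SLPA or AC-SLPA. The function name `slpa`, its parameters (`iterations`, `threshold`, `seed`), and the toy graph are assumptions made for illustration.

```python
import random
from collections import Counter


def slpa(adjacency, iterations=20, threshold=0.1, seed=0):
    """Sketch of unsupervised SLPA; returns {node: set of community labels}.

    `adjacency` maps each node to a list of its neighbours. Parameter names
    and default values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Every node's memory starts with its own id as its only label.
    memory = {node: Counter({node: 1}) for node in adjacency}
    nodes = list(adjacency)
    for _ in range(iterations):
        rng.shuffle(nodes)
        for listener in nodes:
            neighbours = adjacency[listener]
            if not neighbours:
                continue
            # Each neighbour "speaks" one label, sampled in proportion to
            # how often that label occurs in the neighbour's memory.
            heard = Counter()
            for speaker in neighbours:
                labels = list(memory[speaker].keys())
                weights = list(memory[speaker].values())
                heard[rng.choices(labels, weights=weights)[0]] += 1
            # The listener stores the most frequently heard label,
            # breaking ties at random.
            top = max(heard.values())
            winners = sorted(label for label, n in heard.items() if n == top)
            memory[listener][rng.choice(winners)] += 1
    # Post-processing: keep every label whose relative frequency in a node's
    # memory reaches the threshold, so a node may keep several labels.
    result = {}
    for node, counts in memory.items():
        total = sum(counts.values())
        result[node] = {label for label, n in counts.items() if n / total >= threshold}
    return result


# Toy graph (an assumption for illustration): two disconnected triangles.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
communities = slpa(graph)
```

Because a node can retain more than one label after thresholding, the output naturally represents overlapping communities, which is why this family of algorithms suits the setting described in the abstract.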

Summary

Introduction

In many real-world applications involving machine learning, the tasks do not neatly correspond to the standard distinction between supervised and unsupervised learning. In the area of network analysis, tasks such as community detection can potentially benefit from the introduction of “lightweight” supervision originating from domain experts or crowdsourced annotations. This knowledge is encoded as pairwise constraints, each indicating that a pair of nodes in the network should always be assigned to the same community or should never be assigned to the same community. For example, to go beyond looking only at the connections between users, we could present pairs of user profiles to a human annotator (referred to as the “oracle”) and ask whether those two users should be assigned to the same community or to different communities. By harnessing this kind of external knowledge, we can potentially uncover communities of users that would otherwise be difficult to identify using purely unsupervised methods.
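
The pairwise constraints described above can be illustrated with a short sketch. It assumes the conventional names "must-link" and "cannot-link" for the two constraint types, and it assumes that, with overlapping communities, two nodes count as "in the same community" when they share at least one community label. The helper names, the user names, and the example assignment are all hypothetical.

```python
def share_community(communities, u, v):
    """True if nodes u and v have at least one community label in common."""
    return bool(communities[u] & communities[v])


def violated_constraints(communities, must_link, cannot_link):
    """List the constraints that an overlapping assignment fails to satisfy."""
    violated = []
    for u, v in must_link:
        # A must-link pair should share at least one community.
        if not share_community(communities, u, v):
            violated.append(("must-link", u, v))
    for u, v in cannot_link:
        # A cannot-link pair should share no community at all.
        if share_community(communities, u, v):
            violated.append(("cannot-link", u, v))
    return violated


# Hypothetical overlapping assignment: "bob" belongs to both A and B.
communities = {"ann": {"A"}, "bob": {"A", "B"}, "cat": {"B"}, "dan": {"B"}}

# Hypothetical oracle answers for four node pairs.
must_link = [("ann", "bob"), ("ann", "cat")]
cannot_link = [("ann", "dan"), ("bob", "cat")]

violations = violated_constraints(communities, must_link, cannot_link)
```

A check of this kind only measures agreement with the oracle after the fact; a semi-supervised method would instead use the constraints to steer the assignment while communities are being formed.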


