Abstract

In many real applications of semi-supervised learning, the guidance provided by a human oracle might be “noisy” or inaccurate. Human annotators will often be imperfect, in the sense that they can make subjective decisions, they might only have partial knowledge of the task at hand, or they may simply complete a labeling task incorrectly due to the burden of annotation. Similarly, in the context of semi-supervised community finding in complex networks, information encoded as pairwise constraints may be unreliable or conflicting due to the human element in the annotation process. This study aims to address the challenge of handling noisy pairwise constraints in overlapping semi-supervised community detection, by framing the task as an outlier detection problem. We propose a general architecture which includes a process to “clean” or filter noisy constraints. Furthermore, we introduce multiple designs for the cleaning process which use different types of outlier detection models, including autoencoders. A comprehensive evaluation is conducted for each proposed methodology, which demonstrates the potential of the proposed architecture for reducing the impact of noisy supervision in the context of overlapping community detection.
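
As a rough illustration of what treating constraint cleaning as outlier detection could look like, the sketch below filters pairwise constraints using a simple robust statistic. It is not the architecture proposed in the paper: the node embeddings are assumed to be given, the function name `clean_constraints` and the threshold of 3.5 are hypothetical, and a median-absolute-deviation score stands in for the outlier detection models (such as autoencoders) that the study evaluates.

```python
import math
from statistics import median

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def clean_constraints(constraints, embeddings, threshold=3.5):
    """Drop constraints whose embedding distance is an outlier for their kind.

    constraints: list of (node_a, node_b, kind), kind is "must" or "cannot".
    embeddings:  dict mapping node -> vector (assumed to be precomputed).
    threshold:   cut-off on the modified z-score (hypothetical default).
    """
    kept = []
    for kind in ("must", "cannot"):
        group = [c for c in constraints if c[2] == kind]
        if not group:
            continue
        dists = [euclidean(embeddings[a], embeddings[b]) for a, b, _ in group]
        med = median(dists)
        mad = median([abs(d - med) for d in dists])
        for constraint, dist in zip(group, dists):
            if mad == 0:
                # No spread in this group, so nothing can be called an outlier.
                kept.append(constraint)
                continue
            score = 0.6745 * (dist - med) / mad  # modified z-score
            # A must-link pair that is unusually far apart is suspicious;
            # a cannot-link pair that is unusually close is suspicious.
            noisy = score > threshold if kind == "must" else score < -threshold
            if not noisy:
                kept.append(constraint)
    return kept

# Toy data: two well-separated groups of nodes in a 2-D embedding space.
embeddings = {
    0: (0, 0), 1: (1, 0), 2: (0, 1),
    3: (10, 10), 4: (11, 10), 5: (10, 11),
}
constraints = [
    (0, 1, "must"), (1, 2, "must"), (0, 2, "must"),
    (3, 4, "must"), (4, 5, "must"),
    (0, 5, "must"),      # noisy: links nodes from different groups
    (0, 3, "cannot"), (1, 4, "cannot"), (2, 5, "cannot"),
    (0, 4, "cannot"), (1, 3, "cannot"),
    (3, 5, "cannot"),    # noisy: separates nodes from the same group
]
kept = clean_constraints(constraints, embeddings)
removed = set(constraints) - set(kept)
```

In this toy setting, the two deliberately inconsistent constraints are the ones that end up in `removed`. The cleaned list could then be passed to a semi-supervised community detection algorithm in place of the raw annotations.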

Highlights

  • Given the best-performing outlier detection models and deep embedding functions identified in Experiment 1, we assess the performance of community finding with AC-SLPA, the active semi-supervised Speaker-listener Label Propagation Algorithm (SLPA), using each category of constraint cleaning process described in the section “Process for identifying noisy constraints”, in order to identify the best option

  • Each table entry includes the average Normalized Mutual Information (NMI) score of AC-SLPA combined with each cleaning method over networks with specific parameters

Summary

Introduction

Complex networks occur in many aspects of life, from social systems to biological processes. Despite their diversity, many networks share common properties and principles of organization (Boccaletti et al. 2006). One essential property that helps us to understand complex networks is community structure. Finding these sets of nodes, or communities, provides us with three important capabilities: understanding the structure and functionality of a network, modeling the dynamic processes that take place on it, and predicting its future behavior. Algorithms for detecting communities are unsupervised in nature. That is, they rely solely on the network topology during the detection process, rather than using any prior information or training data regarding the “correct” community structure. One common issue is that these algorithms can fail to uncover groupings that accurately reflect the ground truth in a specific domain when the communities overlap heavily with one another (Ahn et al. 2010).
