Abstract

People today often participate in multiple heterogeneous social networks simultaneously. Discovering the anchor links between accounts owned by the same user across different social networks is crucial for many inter-network applications, e.g., cross-network link transfer and cross-network recommendation. Many supervised models have been proposed to predict anchor links, but they are effective only when labeled anchor links are abundant. In real scenarios this requirement can hardly be met: most anchor links are unlabeled, since manually labeling inter-network anchor links is costly and tedious. To overcome this problem and exploit the numerous unlabeled anchor links in model building, this paper introduces the active learning based anchor link prediction problem. Unlike traditional active learning problems, anchor links are subject to a one-to-one constraint: if an unlabeled anchor link between account u and account v is identified as positive (i.e., existing), all other unlabeled anchor links incident to u or v automatically become negative (i.e., non-existing). From this perspective, querying the labels of potentially positive anchor links in the unlabeled set is especially rewarding in the active anchor link prediction problem. This paper defines several novel anchor link information gain measures and, based on them, introduces several constrained active anchor link prediction methods. Extensive experiments on real-world social network datasets compare these methods with state-of-the-art anchor link prediction methods. The experimental results show that the proposed Mean-entropy-based Constrained Active Learning (MC) method outperforms the other methods by a significant margin.

Highlights

  • Online social networks have become increasingly popular in recent years, and are often represented as heterogeneous information networks containing abundant information about who, where, when and what [1]

  • The experiments show that the Mean-entropy-based Constrained Active Learning (MC) and Basic-entropy-based Constrained Active Learning (BBC) methods substantially improve the performance of Multi-Network Anchoring (MNA)

  • Unlike traditional query methods, our constrained active learning methods can label more than one link after a single unlabeled anchor link has been queried

Summary

Introduction

Online social networks have become increasingly popular in recent years, and are often represented as heterogeneous information networks containing abundant information about who, where, when and what [1]. To predict anchor links between multiple social networks, many supervised methods have been proposed. These existing methods achieve good performance only when sufficient labeled anchor links can be collected to train the models [1,8,9,10,11,12]. Unlike existing active learning methods, when one positive anchor link is identified, our methods can also discover a group of negative anchor links incident to its nodes via the one-to-one constraint. This addresses the challenge that the one-to-one constraint poses for anchor link prediction.
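The effect of the one-to-one constraint on a single query can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `apply_one_to_one_constraint`, the representation of a link as a pair of account identifiers, and the toy accounts are all assumptions made for illustration. It shows only the label propagation step, not how the link to query is chosen.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: (account in network 1, account in network 2).
Link = Tuple[str, str]


def apply_one_to_one_constraint(
    queried: Link,
    is_positive: bool,
    candidates: Iterable[Link],
    labels: Dict[Link, bool],
) -> Dict[Link, bool]:
    """Return a copy of `labels` with the queried label and any inferred negatives."""
    updated = dict(labels)
    updated[queried] = is_positive
    if not is_positive:
        # A negative answer says nothing about the other candidate links.
        return updated
    u, v = queried
    for link in candidates:
        if link == queried or link in updated:
            continue  # keep labels that are already known
        a, b = link
        # Any other link touching u or v must be negative under one-to-one matching.
        if a == u or b == v:
            updated[link] = False
    return updated


# Toy example with two accounts per network (illustrative data only).
candidates = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2")]
labels = apply_one_to_one_constraint(("u1", "v1"), True, candidates, {})
negative_only = apply_one_to_one_constraint(("u1", "v1"), False, candidates, {})
print(labels)
```

In this toy case one positive answer labels three of the four candidate links, while a negative answer labels only the queried link, which is why querying likely positive links is described as rewarding.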

Related Works
Problem Formulation
The Constrained Active Learning for Anchor Link Prediction
The Basic Anchor Link Prediction Method
The Constrained Active Learning Methods
The Normal Constrained Active Learning Methods
The Biased Constrained Active Learning Methods
Data Preparation
Experiment Setups
Effectiveness Experiments
Portability Experiments
Findings
Conclusions
