Abstract

User tags in social networks are valuable for many applications, such as Web search, recommender systems, and online advertising. Extracting high-quality tags that capture user interests has therefore attracted considerable research attention in recent years. Most previous studies inferred users’ interests from the text they post in social networks. However, ordinary users often publish only a small number of posts, and the text may have little to do with their interests. Compared with famous users, it is therefore more challenging to find the interests of non-famous (ordinary) users. In this paper, we propose a probabilistic topic model, Bi-Labeled LDA, to automatically find interest tags for non-famous users in social networks such as Twitter. Instead of being extracted from text posts, the tags of non-famous users are inferred from the interest topics of famous users. The proposed model simulates the formation of social relationships between non-famous and famous users, and exploits the interest tags of famous users both to supervise training and to make use of latent relations among famous users. Furthermore, the influence of famous users’ popularity and of popular tags is taken into account, and the tags of non-famous users are ranked with a random walk model. Experiments were conducted on real Twitter datasets. Comparison with state-of-the-art methods shows that our method is superior in terms of both the ranking and the quality of the tagging results.

Highlights

  • Online social networking platforms like Twitter have become a mainstream medium, attracting millions of people who spend time there every day

  • We study two mining problems: how to extract interest tags for non-famous users from the social relationships among users in a social network, without using tweet information, and how to rank the tags of each non-famous user so that the ranking captures the importance of different tags

  • We propose a model to find interest tags for non-famous users based on the underlying social network in Twitter, without using tweet text information

Summary

Introduction

Online social networking platforms like Twitter have become a mainstream medium, attracting millions of people who spend time there every day. A key challenge in tagging users with their interests is how to accurately infer the topics of interest for a user u. Most prior studies attempted to infer the topics of interest from the tweet content posted or retweeted by u in Twitter, mainly using topic models such as LDA and Labeled Latent Dirichlet Allocation (Labeled LDA) [13]. The most competitive models for this problem use tweet content information. Some other approaches use both the tweet content and the social relationship information, mining users’ topics of interest from tweets and re-ranking users’ interests based on the underlying social network [12, 18]. However, the tweets that users publish usually cannot reflect or cover all of their topics of interest.

