Evaluating tag quality for blogger modelling via topic models

Lili Shan, Chengjie Sun, Xiaolong Wang, Ming Liu, Lei Lin, Bingquan Liu

doi:10.1109/fskd.2015.7382215

Abstract

Because bloggers can annotate their posts with tags, tags have become one of the most important resources for describing blogger features. However, tag quality is uneven, and not all tags are appropriate for representing a blogger's preferences. Poor or spam tags mix spam terms in with the user's actual preferences, so they should be detected before tags are used directly to label bloggers. This paper presents a detailed quantitative analysis of the categories of tag spam in the blogosphere. Taking advantage of the abundant text content in blog posts and the relatively stable semantic relationship between tags and their target posts, it proposes an unsupervised approach based on topic models to evaluate tag quality for blogger modelling in the blogosphere. The latent interest topics of a blogger are mined through Latent Dirichlet Allocation (LDA) topic modeling. Each blog post of the blogger is represented as a distribution over latent topics, and each latent topic is a distribution over the words of the vocabulary. A tag is likewise expressed as a specific co-occurrence term vector. Finally, a scheme is devised to determine the similarity between each tag and its target blog post, and tags with low similarity values are identified as poor tags. The experimental results indicate that the proposed method performs better than the baselines on datasets collected from Sina Blog, one of the largest Chinese blogging platforms.
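The scoring idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the similarity scheme, so this example makes several assumptions: the topic-word distributions (`phi`) and the post's topic mixture (`theta`) are hand-set toy values standing in for a fitted LDA model, the tag vectors are invented co-occurrence counts, cosine similarity is swapped in as the similarity measure, and the 0.5 threshold is arbitrary. All names are hypothetical and none of this is taken from the paper itself.

```python
import math

# Hypothetical toy values standing in for a fitted LDA model (not from the paper).
# phi[k][w]: probability of word w under latent topic k.
phi = [
    {"recipe": 0.5, "oven": 0.4, "camera": 0.05, "lens": 0.05},  # cooking-like topic
    {"recipe": 0.05, "oven": 0.05, "camera": 0.5, "lens": 0.4},  # photography-like topic
]
# theta[k]: topic mixture of one blog post (mostly the cooking-like topic).
theta = [0.9, 0.1]

# Assumed co-occurrence term vectors: how often each word appears
# in posts that carry the tag (invented counts).
tag_vectors = {
    "baking": {"recipe": 8, "oven": 6, "camera": 1},
    "cheap-lenses": {"camera": 7, "lens": 9},
}


def post_word_distribution(theta, phi):
    """Mix the topic-word distributions by the post's topic weights:
    p(w | post) = sum_k theta[k] * phi[k][w]."""
    dist = {}
    for weight, topic in zip(theta, phi):
        for word, prob in topic.items():
            dist[word] = dist.get(word, 0.0) + weight * prob
    return dist


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * v.get(word, 0.0) for word, value in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def score_tags(theta, phi, tag_vectors):
    """Score each tag by its similarity to the post's word distribution."""
    post_dist = post_word_distribution(theta, phi)
    return {tag: cosine(vec, post_dist) for tag, vec in tag_vectors.items()}


def poor_tags(scores, threshold=0.5):
    """Return tags whose similarity falls below the (assumed) threshold."""
    return sorted(tag for tag, score in scores.items() if score < threshold)


scores = score_tags(theta, phi, tag_vectors)
flagged = poor_tags(scores)
print(scores)
print(flagged)
```

With these toy numbers, `baking` scores far higher than `cheap-lenses`, so only the latter falls below the threshold and is flagged. In practice `phi` and `theta` would come from an LDA model fitted on the blogger's posts, and the threshold would need tuning on labelled data.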

Similar Papers

Latent Dirichlet allocation topic modeling of free-text responses exploring the negative impact of the early COVID-19 pandemic on research in nursing.
Madoka Inoue ... Hideo Tohira
Japan journal of nursing science : JJNS | VOL. 20
30 Nov 2022

Hashtag Recommendation System in a P2P Social Networking Application
Keerthi Nelaturu ... Ying Qiao
23 Jul 2015

Content Management and Hashtag Recommendation in a P2P Social Networking Application
01 Jan 2015

Topic Model—Machine Learning Classifier Integrations on Geocoded Twitter Data
Gillian Kant ... Benjamin Säfken
23 Nov 2022
