Identifying Topics in Microblogs Using Wikipedia.

Ahmet Yıldırım, Suzan Üsküdarlı, Arzucan Özgür

doi:10.1371/journal.pone.0151885

Open Access

https://doi.org/10.1371/journal.pone.0151885

Journal: PLOS ONE
Publication Date: Mar 18, 2016
Citations: 16
License type: CC BY 4.0

Affiliation: Boğaziçi University

Abstract

Twitter is an extremely high-volume platform for user-generated contributions on any topic. The wealth of content created in real time and in massive quantities calls for automated approaches to identify the topics of the contributions. Such topics can be utilized in numerous ways, such as public opinion mining, marketing, entertainment, and disaster management. Towards this end, approaches that relate single or partial posts to knowledge base items have been proposed. However, in microblogging systems like Twitter, topics emerge from the culmination of a large number of contributions. It is therefore necessary to identify topics based on collections of posts, where individual posts contribute to some aspect of the greater topic. Models such as Latent Dirichlet Allocation (LDA) propose algorithms for relating collections of posts to sets of keywords that represent underlying topics. In these approaches, determining which specific topic(s) the keyword sets represent remains a separate task. Another issue in topic detection is scope, which is often limited to a specific domain, such as health. This work proposes an approach for identifying domain-independent specific topics related to sets of posts. In this approach, individual posts are processed and then aggregated to identify key tokens, which are then mapped to specific topics. Wikipedia article titles are selected to represent topics, since the articles are up-to-date, user-generated, and sophisticated, and span topics of human interest. This paper describes the proposed approach, a prototype implementation, and a case study based on data gathered during the heavily contributed periods corresponding to the four US election debates in 2012. The manually evaluated results (0.96 precision) and other observations from the study are discussed in detail.
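
The pipeline outlined in the abstract (process individual posts, aggregate them to find key tokens, map those tokens to Wikipedia article titles) can be illustrated with a minimal sketch. This is not the authors' implementation: the stopword list, the `min_share` threshold, the rule that every word of a title must be a key token, and the small in-memory title list are all assumptions made for illustration. A real system would draw titles from Wikipedia itself and would likely use richer matching.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "on", "of", "and", "to", "in", "about", "for"}


def tokenize(text):
    """Lowercase the text and return its distinct non-stopword word tokens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def key_tokens(posts, min_share=0.5):
    """Aggregate over a set of posts: keep tokens found in at least min_share of them."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))  # a token counts once per post
    threshold = min_share * len(posts)
    return {tok for tok, n in counts.items() if n >= threshold}


def match_titles(tokens, titles):
    """Return titles whose words are all key tokens (assumed matching rule)."""
    matched = [t for t in titles if tokenize(t) and tokenize(t) <= tokens]
    # Prefer longer, more specific titles; break ties alphabetically.
    return sorted(matched, key=lambda t: (-len(tokenize(t)), t))


# Made-up posts and candidate titles, not data from the study.
posts = [
    "Obama talks about health care reform",
    "Romney attacks Obama on health care",
    "Health care is the big debate topic tonight",
    "Obama defends his health care plan",
]
titles = ["Health care", "Barack Obama", "Mitt Romney", "Health care reform"]

tokens = key_tokens(posts)
topics = match_titles(tokens, titles)
print(sorted(tokens), topics)
```

In this toy run no single post names the topic on its own terms, but the aggregated key tokens select "Health care". The strict all-words rule also rejects "Barack Obama" because "barack" never appears, which shows why exact token overlap is only a starting point.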

Highlights

Twitter [1] is the most popular microblogging system in the world, with over 280 million active users tweeting around 40K posts/s [2].
We propose to use the titles of Wikipedia articles to represent topics.
Relying on the limited context of short microblog posts and discarding the descriptive content of Wikipedia article bodies may lead to less inclusive and less descriptive topics. We show this in the section "Comparison of processing single-microblog posts and microblog post sets", which examines several cases by comparing our proposed approach with one that aggregates the results returned by [34, 35].

Summary

Introduction

Twitter [1] is the most popular microblogging system in the world, with over 280 million active users tweeting around 40K posts/s [2]. It serves as a collective platform where users tweet (post) anything about anything [3], such as current events, sports, politics, health, conferences, and personal life. This way, the author makes a connection between Obama's words and the context of the debate. The author can also add an opinion on the subject.

Similar Papers

Experiments in Microblog Summarization
Beaux Sharifi ... Mark-Anthony Hutton
01 Aug 2010

Microblog Topic Detection Based on LDA Model and Single-Pass Clustering
Bo Huang ... Yan Yang
01 Jan 2012

Microblog topic identification using Linked Open Data.
Ahmet Yıldırım ... Suzan Uskudarli
PLOS ONE | Vol. 15
11 Aug 2020

A hot topic detection method for Chinese Microblog based on topic words
Jun Zheng ... Yuanjun Li
01 Dec 2014
