Abstract
Dynamic Topic Modeling (DTM) extracts topics from the short texts generated in Online Social Networks (OSNs) such as Twitter. A DTM solution must be scalable and must account for the sparsity and dynamicity of short texts. Current solutions combine probabilistic mixture models, such as the Dirichlet Multinomial or the Pitman-Yor Process, with approximate inference approaches, such as Gibbs Sampling and Stochastic Variational Inference, to account for dynamicity and scalability, respectively. However, these methods rely on weak probabilistic language models that do not account for the sparsity of short texts, and their inference is based on iterative optimizations that scale poorly in the DTM setting. We present GDTM, a single-pass graph-based DTM algorithm, to solve this problem. GDTM combines a context-rich, incremental feature representation with graph partitioning to address scalability and dynamicity, and uses a rich language model to account for sparsity. We run multiple experiments over a large-scale Twitter dataset to analyze the accuracy and scalability of GDTM and compare the results with four state-of-the-art models. GDTM outperforms the best of these models by 11% on accuracy, runs an order of magnitude faster, and produces four times better topic quality on standard evaluation metrics.
Highlights
Motivation: topic modeling [1] is the problem of automatically classifying words, which form the context of documents, into similarity groups known as topics
Given a dataset D with n documents, tagged with k hand labels, L = {l1, . . . , lk}, and a classification of the documents into k class labels, C = {c1, . . . , ck}, the B-Cubed precision and recall of a document d with hand label ld and class label cd are calculated as: Precision(d) = |{d′ ∈ D : cd′ = cd ∧ ld′ = ld}| / |{d′ ∈ D : cd′ = cd}| and Recall(d) = |{d′ ∈ D : cd′ = cd ∧ ld′ = ld}| / |{d′ ∈ D : ld′ = ld}|
We demonstrate the accuracy and scalability of GDTM by running the algorithm over two sets of experiments
We developed GDTM, a solution for dynamic topic modeling on short texts in online social networks
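The B-Cubed metric mentioned above scores a clustering by computing, for each document, the fraction of documents in its cluster that share its hand label (precision) and the fraction of documents with its hand label that share its cluster (recall), then averaging over all documents. A minimal sketch in Python (the function name and input representation are our own; the paper's exact evaluation setup may differ):

```python
from collections import Counter

def b_cubed(hand_labels, class_labels):
    """Average B-Cubed precision, recall and F1 over all documents.

    hand_labels[i] is the gold (hand) label of document i;
    class_labels[i] is the cluster assigned to document i.
    """
    n = len(hand_labels)
    # How many documents share both a given hand label and a given cluster.
    joint = Counter(zip(hand_labels, class_labels))
    label_sizes = Counter(hand_labels)
    cluster_sizes = Counter(class_labels)

    precision = recall = 0.0
    for l, c in zip(hand_labels, class_labels):
        correct = joint[(l, c)]            # same label AND same cluster
        precision += correct / cluster_sizes[c]
        recall += correct / label_sizes[l]
    precision /= n
    recall /= n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

A perfect clustering yields precision = recall = 1; merging two gold classes into one cluster lowers precision while leaving recall at 1 for the merged documents, which is why both sides are reported.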
Summary
Motivation: topic modeling [1] is the problem of automatically classifying words, which form the context of documents, into similarity groups known as topics. Documents generated in today's social media (such as Twitter or Facebook) are (i) fast (large-scale and continuous), (ii) sparse (short) and (iii) dynamic (with the constant emergence of newly generated phrases and context structures). Extracting topics under these conditions is the problem known as Dynamic Topic Modeling (DTM). A legitimate solution to DTM should continuously receive a large number of short texts, extract their topics and adapt to changes in those topics