A Pseudo-document-based Topical N-grams model for short texts

Hao Lin, Yuan Zuo, Junjie Wu, Hong Li, Zhiang Wu, Guannan Liu

doi:10.1007/s11280-020-00814-x

Abstract

In recent years, short text topic modeling has drawn considerable attention from interdisciplinary researchers. Various customized topic models have been proposed to tackle the semantic sparseness of short texts. Most (if not all) of them follow the bag-of-words assumption, which is inadequate because word order and phrases are often critical to capturing the meaning of texts. On the other hand, while some existing topic models are sensitive to word order, they do not perform well on short texts because of severe data sparseness. To address these issues, we propose the Pseudo-document-based Topical N-Grams model (PTNG), which alleviates the data sparsity problem of short texts while remaining sensitive to word order. Extensive experiments on three real-world data sets against state-of-the-art baselines demonstrate that PTNG learns high-quality topics, as measured by UCI coherence scores, and more discriminative semantic representations of short texts, as measured by classification results.

Journal: World Wide Web
Publication Date: Jul 23, 2020
Citations: 5

Similar Papers

Topic Modeling of Short Texts
Yuan Zuo ... Hui Zhang
13 Aug 2016

A systematic review of the use of topic models for short text social media analysis
Caitlin Doogan Poet Laureate ... Henry Linger
Artificial Intelligence Review | VOL. 56
01 May 2023

GLTM: A Global and Local Word Embedding-Based Topic Model for Short Texts
Wenxin Liang ... Yuangang Li
IEEE Access | VOL. 6
01 Jan 2018

Semantic Augmented Topic Model over Short Text
Lingyun Li ... Yawei Sun
01 Nov 2018
