GLTM: A Global and Local Word Embedding-Based Topic Model for Short Texts

Wenxin Liang, Xinyue Liu, Xianchao Zhang, Yuangang Li, Ran Feng

doi:10.1109/access.2018.2863260

Abstract

Short texts have become a prevalent source of information, and discovering topical information from short text collections is valuable for many applications. Because short texts are limited in length, conventional topic models that rely on document-level word co-occurrence often fail to distill semantically coherent topics from them. Word embeddings, meanwhile, have been applied successfully in natural language processing. Embeddings trained on a large corpus encode general semantic and syntactic information about words, so they can guide topic modeling for short text collections by supplementing sparse co-occurrence patterns. However, because these embeddings are trained on a large external corpus, the information they encode is not necessarily suited to the data set on which the topic model is trained, a mismatch that most existing models ignore. In this article, we propose a novel global and local word embedding-based topic model (GLTM) for short texts. In the GLTM, we train global word embeddings on a large external corpus and employ the continuous skip-gram model with negative sampling (SGNS) to obtain local word embeddings. Using both the global and local word embeddings, the GLTM distills semantic relatedness information between words, which the Gibbs sampler then leverages during inference to strengthen the semantic coherence of topics. Compared with five state-of-the-art short text topic models on four real-world short text collections, the proposed GLTM proves superior in most cases.
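The idea of drawing word relatedness from two embedding spaces can be illustrated with a minimal sketch. The abstract does not say how the GLTM combines the global and local signals or how the Gibbs sampler consumes them, so everything below is an assumption made for illustration: the cosine similarity measure, the weighted average controlled by `weight`, the `threshold` cutoff, the function names, and the toy two-dimensional vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(word_a, word_b, global_emb, local_emb, weight=0.5):
    """Hypothetical combination rule: weighted average of the cosine
    similarities measured in the global and the local embedding space."""
    g = cosine(global_emb[word_a], global_emb[word_b])
    l = cosine(local_emb[word_a], local_emb[word_b])
    return weight * g + (1.0 - weight) * l


def related_pairs(vocab, global_emb, local_emb, threshold=0.4, weight=0.5):
    """Return (word_a, word_b, score) for every pair scoring at least `threshold`."""
    pairs = []
    for i, a in enumerate(vocab):
        for b in vocab[i + 1:]:
            score = relatedness(a, b, global_emb, local_emb, weight)
            if score >= threshold:
                pairs.append((a, b, score))
    return pairs


# Toy vectors (invented): globally "apple" sits with "fruit",
# while in the local space it sits with "phone".
vocab = ["apple", "fruit", "phone"]
global_emb = {"apple": [1.0, 0.0], "fruit": [1.0, 0.0], "phone": [0.0, 1.0]}
local_emb = {"apple": [0.0, 1.0], "fruit": [1.0, 0.0], "phone": [0.0, 1.0]}

pairs = related_pairs(vocab, global_emb, local_emb)
for a, b, score in pairs:
    print(a, b, round(score, 2))
```

In this toy setup, relying on the global space alone (`weight=1.0`) would relate "apple" only to "fruit", while mixing in the local space also surfaces the "apple" and "phone" pair. A topic model could use such pairs to favor assigning related words to the same topic, although the mechanism the GLTM actually uses is not described in the abstract.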

Journal: IEEE Access
Publication Date: Jan 1, 2018
Citations: 56
License type: CC BY-NC-ND

Similar Papers

Topic Modeling for Short Texts with Auxiliary Word Embeddings
Chenliang Li ... Aixin Sun
07 Jul 2016

Enhancing Topic Modeling for Short Texts with Auxiliary Word Embeddings
Chenliang Li ... Aixin Sun
ACM Transactions on Information Systems | VOL. 36
21 Aug 2017

Investigating the Efficient Use of Word Embedding with Neural-Topic Models for Interpretable Topics from Short Texts
Riki Murakami ... Basabi Chakraborty
Sensors | VOL. 22
23 Jan 2022

Short Text Topic Model with Word Embeddings and Context Information
Xianchao Zhang ... Ran Feng
27 Jun 2018
