Topic representation: Finding more representative words in topic models

Jinjin Chi, Jihong Ouyang, Changchun Li, Xueyang Dong, Ximing Li, Xinhua Wang

doi:10.1016/j.patrec.2019.01.018

Abstract

The top word list, i.e., the top-M words with the highest marginal probabilities in a given topic, is the standard topic representation in topic models. Most recent automatic topic labeling algorithms and popular topic quality metrics are based on it. However, we find empirically that the words in this type of top word list are not always representative. The objective of this paper is to find more representative top word lists for topics. To achieve this, we rerank the words in a given topic by further considering their marginal probabilities over every other topic. The reranked list of top-M words is then used as a novel topic representation for topic models. We investigate three reranking methodologies, using (1) standard deviation weight, (2) standard deviation weight with topic size, and (3) chi-square (χ²) statistic selection. Experimental results on real-world collections indicate that our representations can extract more representative words for topics, in agreement with human judgements.
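
To make the reranking idea concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not give the scoring formulas, so the rule used here (multiplying a word's probability in the topic by the standard deviation of its probabilities across all topics) is an assumption made for illustration only. The names `phi`, `top_words`, and `std_reranked_top_words` are hypothetical, and the toy matrix is made-up data.

```python
import numpy as np


def top_words(phi, k, m):
    """Baseline: indices of the m words with the highest probability in topic k."""
    return np.argsort(-phi[k], kind="stable")[:m]


def std_reranked_top_words(phi, k, m):
    """Illustrative rerank using a standard deviation weight.

    ASSUMPTION (not from the paper): score(w, k) = phi[k, w] * std_w, where
    std_w is the standard deviation of word w's probability across topics.
    A word that is equally probable in every topic gets a weight near zero.
    """
    weights = phi.std(axis=0)   # one weight per word, computed over topics
    scores = phi[k] * weights   # down-weight words shared by all topics
    return np.argsort(-scores, kind="stable")[:m]


# Toy topic-word matrix: 3 topics (rows) x 4 words (columns), rows sum to 1.
# Word 0 is equally likely in every topic, so it says little about any of them.
phi = np.array([
    [0.40, 0.30, 0.20, 0.10],
    [0.40, 0.10, 0.10, 0.40],
    [0.40, 0.10, 0.30, 0.20],
])

baseline = top_words(phi, k=0, m=2).tolist()
reranked = std_reranked_top_words(phi, k=0, m=2).tolist()
print("baseline:", baseline)
print("reranked:", reranked)
```

In this toy example the baseline list for topic 0 starts with word 0, which is shared by all topics, while the reranked list drops it in favour of words whose probabilities vary across topics. The topic-size and chi-square variants named in the abstract would need their own scoring rules, which the abstract does not specify.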

Journal: Pattern Recognition Letters
Publication Date: Mar 18, 2019
Citations: 12

Similar Papers

A Topic Representation Model for Online Social Networks Based on Hybrid Human–Artificial Intelligence
Weihong Han ... Zizhong Huang
IEEE Transactions on Computational Social Systems | VOL. 8
03 Jan 2020

An Overview of Topic Representation and Topic Modelling Methods for Short Texts and Long Corpus
D Yamunathangam ... G Shobana
08 Oct 2021

Quality indices for topic model selection and evaluation: a literature review and case study
Christopher Meaney ... Michael Escobar
BMC Medical Informatics and Decision Making | VOL. 23
22 Jul 2023

Enhancing Graph Variational Autoencoder for Short Text Topic Modeling with Mutual Information Maximization
Yuhang Ge ... Xuegang Hu
01 Nov 2022
