Agreeing to Disagree: Choosing Among Eight Topic-Modeling Methods

Qiang Fu, Xin Guo, Jiaxin Gu, Yufan Zhuang, Yushu Zhu

doi:10.1016/j.bdr.2020.100173

Abstract

Topic modeling is a key research area in natural language processing and has inspired innovative studies in a wide array of social-science disciplines. Yet the use of topic modeling in computational social science has been hampered by two critical issues. First, social scientists tend to rely on a few standard topic-modeling methods, so our understanding of semantic patterns has not been informed by rapid methodological advances in the field, and a systematic comparison of how different methods perform is warranted. Second, choosing the optimal number of topics remains a challenging task. Comparisons of topic-modeling techniques have rarely been situated in a social-science context, and the choice appears to be arbitrary for most social scientists. Using about 120,000 Canadian newspaper articles published since 1977, we review and compare eight traditional, generative, and neural methods for topic modeling (Latent Semantic Analysis, Principal Component Analysis, Factor Analysis, Non-negative Matrix Factorization, Latent Dirichlet Allocation, Neural Autoregressive Topic Model, Neural Variational Document Model, and Hierarchical Dirichlet Process). Three measures (coherence statistics, held-out likelihood, and graph-based dimensionality selection) are then used to assess the performance of these methods. Findings are presented and discussed to guide the choice of topic-modeling methods, especially in social-science research.
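To make the first of the three measures concrete, the sketch below computes a UMass-style coherence score for a single topic from document co-occurrence counts. The abstract does not say which coherence statistic the authors used, so the UMass formulation, the function name `umass_coherence`, and the toy corpus are all assumptions chosen for illustration, not the paper's method or data.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass-style coherence for one topic (illustrative, hypothetical helper).

    top_words: a topic's top words, ordered from most to least probable.
    documents: tokenized documents (lists of words).
    Scores closer to zero mean the top words co-occur more often.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    # For each pair, the higher-ranked word supplies the denominator.
    for i, j in combinations(range(len(top_words)), 2):
        higher, lower = top_words[i], top_words[j]
        denom = doc_freq(higher)
        if denom == 0:
            continue  # word absent from the corpus; skip the pair
        # +1 smoothing avoids log(0) when the pair never co-occurs.
        score += math.log((doc_freq(higher, lower) + 1) / denom)
    return score


# Toy corpus (made up for this sketch, not drawn from the paper).
docs = [
    ["election", "vote", "party"],
    ["election", "vote", "minister"],
    ["hockey", "goal", "team"],
    ["hockey", "team", "vote"],
]

coherent_topic = ["election", "vote", "party"]
mixed_topic = ["election", "hockey", "goal"]

coherent_score = umass_coherence(coherent_topic, docs)
mixed_score = umass_coherence(mixed_topic, docs)
print(coherent_score, mixed_score)
```

In a comparison across methods or across candidate numbers of topics, a score like this would typically be averaged over all topics of each fitted model, with higher averages read as more interpretable topics.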

Journal: Big Data Research
Publication Date: Dec 16, 2020
Citations: 12
License type: cc-by-nc-nd

Similar Papers

Search for K: Assessing Five Topic-Modeling Approaches to 120,000 Canadian Articles
Qiang Fu ... Yufan Zhuang
01 Dec 2019

Evaluation of clustering and topic modeling methods over health-related tweets and emails
Juan Antonio Lossio-Ventura ... Jiang Bian
Artificial Intelligence in Medicine | VOL. 117
07 May 2021

Using Topic Modeling Methods for Short-Text Data: A Comparative Analysis
Rania Albalawi ... Tet Hin Yeap
Frontiers in Artificial Intelligence | VOL. 3
14 Jul 2020

Topic modeling for feature location in software models: Studying both code generation and interpreted models
Francisca Pérez ... Raúl Lapeña
Information and Software Technology | VOL. 140
01 Dec 2021
