Abstract

Topic modelling approaches such as LDA, when applied to a tweet corpus, often generate a topic model containing redundant topics. To evaluate the quality of a topic model in terms of redundancy, topic similarity metrics can be applied to estimate the similarity among the topics in the model. Various topic similarity metrics exist in the literature, e.g. the Jensen-Shannon (JS) divergence-based metric. In this paper, we evaluate the performance of four distance/divergence-based topic similarity metrics, including a newly proposed metric based on computing word semantic similarity using word embeddings (WE), and examine how well they align with human judgements. To obtain human judgements, we conduct a user study through crowdsourcing. Among various insights, our study shows that, in general, the cosine similarity (CS) and WE-based metrics perform better and appear to be complementary. However, we also find that human assessors cannot easily distinguish between the distance/divergence-based and the semantic similarity-based metrics when identifying similar latent Twitter topics.
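For illustration only, below is a minimal Python sketch of two of the metrics named above, the JS divergence and cosine similarity computed over topic-word distributions, plus one plausible formulation of a WE-based metric as the mean pairwise cosine similarity between the two topics' top-word embeddings. The function names and the WE formulation are assumptions for illustration; the paper's exact definitions are not reproduced here.

```python
import numpy as np

def js_divergence(p, q):
    # Jensen-Shannon divergence between two topic-word distributions p and q.
    # 0 means identical distributions; larger values mean more dissimilar topics.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)

    def kl(a, b):
        # Kullback-Leibler divergence, skipping zero-probability entries.
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def cosine_similarity(u, v):
    # Cosine similarity between two vectors; 1 indicates identical direction.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def we_similarity(top_words_a, top_words_b, embeddings):
    # Hypothetical WE-based metric: mean pairwise cosine similarity between
    # the embeddings of the two topics' top words. The formulation used in
    # the paper may differ.
    sims = [cosine_similarity(embeddings[wa], embeddings[wb])
            for wa in top_words_a for wb in top_words_b]
    return float(np.mean(sims))

# Toy example: two topic-word distributions over a 4-word vocabulary.
t1 = [0.5, 0.3, 0.1, 0.1]
t2 = [0.4, 0.4, 0.1, 0.1]
print(js_divergence(t1, t2))      # close to 0 -> the topics are likely redundant
print(cosine_similarity(t1, t2))  # close to 1 -> the topics are likely redundant
```

In practice, such pairwise scores would be computed over all topic pairs in a model; pairs scoring above (for similarity) or below (for divergence) a chosen threshold could then be flagged as potentially redundant.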
