Monolingual and Cross-Lingual Probabilistic Topic Models and Their Applications in Information Retrieval

Marie-Francine Moens, Ivan Vulić

doi:10.1007/978-3-642-36973-5_106

Abstract

Probabilistic topic models are a group of unsupervised generative machine learning models that can be effectively trained on large text collections. They model document content as a two-step generation process: documents are observed as mixtures of latent topics, while topics are probability distributions over vocabulary words. Recently, significant research effort has been invested in transferring the probabilistic topic modeling concept from monolingual to multilingual settings. Novel topic models have been designed to work with parallel and comparable multilingual data (e.g., Wikipedia or news data discussing the same events). Probabilistic topic models offer an elegant way to represent content across different languages. Their probabilistic framework allows for easy integration into a language modeling framework for monolingual and cross-lingual information retrieval. Moreover, we present how to use the knowledge from the topic models in the tasks of cross-lingual event clustering, cross-lingual document classification, and the detection of cross-lingual semantic similarity of words. The tutorial also demonstrates how semantically similar words across languages are integrated as useful additional evidence in cross-lingual information retrieval models.

Keywords: Probabilistic topic models, Cross-lingual retrieval, Ranking models, Cross-lingual text mining
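The two-step generation process and its use in a language-modeling retrieval score can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the tutorial: the two toy topics, the four-word vocabulary, the probabilities, and the function names `generate_document`, `word_prob`, and `query_likelihood`. The scoring function is the generic unigram query likelihood with topic-based word probabilities, P(w | d) = Σ_k P(w | z_k) P(z_k | d), and is not necessarily the exact ranking model the authors present.

```python
import random

# Hypothetical toy parameters (illustrative only): two latent topics,
# each a probability distribution over a four-word vocabulary.
topic_word = {
    "sports":   {"match": 0.5,  "goal": 0.4,  "vote": 0.05, "law": 0.05},
    "politics": {"match": 0.05, "goal": 0.05, "vote": 0.5,  "law": 0.4},
}

# Hypothetical document represented as a mixture of the latent topics.
doc_topic = {"sports": 0.8, "politics": 0.2}


def generate_document(doc_topic, topic_word, length, rng):
    """Two-step generation: draw a topic from the document's mixture,
    then draw a word from that topic's word distribution."""
    words = []
    for _ in range(length):
        topic = rng.choices(list(doc_topic), weights=list(doc_topic.values()))[0]
        dist = topic_word[topic]
        words.append(rng.choices(list(dist), weights=list(dist.values()))[0])
    return words


def word_prob(word, doc_topic, topic_word):
    """P(w | d) = sum over topics of P(w | z_k) * P(z_k | d)."""
    return sum(topic_word[z].get(word, 0.0) * p for z, p in doc_topic.items())


def query_likelihood(query, doc_topic, topic_word):
    """Unigram query likelihood: product of P(w | d) over the query words."""
    score = 1.0
    for w in query:
        score *= word_prob(w, doc_topic, topic_word)
    return score


if __name__ == "__main__":
    rng = random.Random(0)
    print(generate_document(doc_topic, topic_word, 8, rng))
    print(query_likelihood(["goal", "match"], doc_topic, topic_word))
```

In a cross-lingual setting, one plausible reading is that the per-topic word distributions come from the query language while the topic mixture is inferred for a document in the other language, so that the shared topics bridge the two vocabularies. The sketch above does not model that step.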

