Parallel inference for cross-collection latent generalized Dirichlet allocation model and applications

Zhiwen Luo, Manar Amayri, Wentao Fan, Koffi Eddy Ihou, Nizar Bouguila

doi:10.1016/j.eswa.2023.121720

Abstract

Existing cross-collection topic models with a document-topic representation encounter performance bottlenecks on large-scale datasets because they rely on Dirichlet priors and conventional inference schemes. These constraints are most noticeable in models derived from the Latent Dirichlet Allocation (LDA) framework. To address these challenges, this paper introduces the GPU-accelerated cross-collection latent generalized Dirichlet allocation (gccLGDA) model. The approach combines the generalized Dirichlet (GD) distribution with GPU-based parallel inference. The GD distribution provides a more flexible prior with a more general covariance structure than the Dirichlet, which allows the model to capture relationships between latent topics across different collections. GPU-based parallel inference makes training scalable and efficient on large datasets. Through empirical evaluations in comparative text mining and document classification, we demonstrate the improved performance of gccLGDA over existing cross-collection topic models.
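
As an illustration of why a GD prior is more flexible than a Dirichlet one, the following is a minimal sketch that draws GD samples with the standard stick-breaking (Connor-Mosimann) construction from independent Beta variables. It is not the gccLGDA inference code from the paper; the function name `sample_generalized_dirichlet` and the parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def sample_generalized_dirichlet(a, b, size, rng):
    """Draw `size` samples from a generalized Dirichlet distribution.

    Hypothetical helper for illustration, not taken from the paper.
    `a` and `b` are length K-1 arrays of Beta shape parameters; each
    returned row is a point on the K-dimensional probability simplex.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # One independent Beta "stick break" per dimension: v_i ~ Beta(a_i, b_i).
    v = rng.beta(a, b, size=(size, a.shape[0]))
    # Stick left after each break: (1-v_1), (1-v_1)(1-v_2), ...
    remaining = np.cumprod(1.0 - v, axis=1)
    theta = np.empty((size, a.shape[0] + 1))
    theta[:, 0] = v[:, 0]
    theta[:, 1:-1] = v[:, 1:] * remaining[:, :-1]
    theta[:, -1] = remaining[:, -1]  # whatever is left goes to the last topic
    return theta


# Illustrative parameters (assumed): a diffuse first break, a concentrated second one.
rng = np.random.default_rng(0)
theta = sample_generalized_dirichlet([1.0, 50.0], [1.0, 50.0], 20000, rng)

# Empirical covariance matrix of the three topic proportions.
cov = np.cov(theta, rowvar=False)
print(cov)
```

Under a Dirichlet prior every pair of proportions is negatively correlated. With the GD parameters assumed here, the second and third proportions can co-vary positively, which is the kind of additional covariance structure the abstract refers to.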

More From: Expert Systems With Applications

Similar Papers

A Semi-Supervised Text Clustering Algorithm with Word Distribution Weights
Jiayin Wei ... Yongbin Qin
01 Jan 2013

Bayesian Folding-In Using Generalized Dirichlet and Beta-Liouville Kernels for Information Retrieval
Sahar Salmanzade Yazdi ... Nizar Bouguila
04 Dec 2022

Topic models with power-law using Pitman-Yor process
Issei Sato ... Hiroshi Nakagawa
25 Jul 2010

Statistical modeling of biomedical corpora: mining the Caenorhabditis Genetic Center Bibliography for genes related to life span.
D M Blei ... I S Mian
BMC bioinformatics | VOL. 7
08 May 2006
