Latent Dirichlet Allocation - An approach for topic discovery

Astha Goyal, Indu Kashyap

doi:10.1109/com-it-con54601.2022.9850912

Abstract

The digital age has brought a sharp increase in data generation and, with it, new challenges in processing that data. Advances in machine learning and natural language processing (NLP) have made such processing far more tractable. Machine learning algorithms can be classified as supervised, unsupervised, or reinforcement learning. Topic modeling is an unsupervised machine learning technique. A typical application is mining a body of text to discover its hidden semantic structure, that is, its topics. Topic modeling techniques include Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (PLSA), Latent Dirichlet Allocation (LDA), and Non-Negative Matrix Factorization (NMF). LDA is considered the most prevalent topic modeling method and has evolved substantially since its introduction. It has given rise to many variants, such as the Hierarchical LDA Model (hLDA), Dynamic Topic Model (DTM), Correlated Topic Model (CTM), Pachinko Allocation Topic Model (PAM), and Author Topic Model. This paper assesses and analyzes LDA, its advancements, and its applications.

Full Text