Abstract

Topic models provide a convenient way to analyze large volumes of unclassified text. A topic contains a cluster of words that frequently occur together. Topic modeling can connect words with similar meanings and distinguish between uses of words with multiple meanings. This paper surveys two categories within the field of topic modeling. The first covers topic modeling methods, of which four are discussed: Latent semantic analysis (LSA), Probabilistic latent semantic analysis (PLSA), Latent Dirichlet allocation (LDA), and Correlated topic model (CTM). The second category, topic evolution models, models topics while taking an important factor, time, into account. In this category, several models are discussed, such as topic over time (TOT), dynamic topic models (DTM), multiscale topic tomography, dynamic topic correlation detection, and detecting topic evolution in scientific literature.

Highlights

  • Managing the current explosion of electronic document archives requires new techniques and tools for automatically organizing, searching, indexing, and browsing large collections

  • Probabilistic Latent Semantic Analysis (PLSA) is an approach introduced after LSA to address some of the disadvantages found in LSA

  • Correlated Topic Model (CTM) is a kind of statistical model used in natural language processing and machine learning

Summary

A Survey of Topic Modeling in Text Mining

Information Systems Security CIISE, Concordia University, Montreal, Quebec, Canada. Abstract: Topic modeling provides a convenient way to analyze large volumes of unclassified text. This paper surveys two categories within the field of topic modeling. The first covers the methods of topic modeling, of which four are discussed: Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (PLSA), Latent Dirichlet Allocation (LDA), and Correlated Topic Model (CTM). The second category, Topic Evolution Models, takes an important factor, time, into account. In this category, several models are discussed, such as Topic Over Time (TOT), Dynamic Topic Models (DTM), Multiscale Topic Tomography, Dynamic Topic Correlation Detection, and Detecting Topic Evolution in Scientific Literature.

INTRODUCTION
THE METHODS OF TOPIC MODELING
Latent Semantic Analysis
Probabilistic Latent Semantic Analysis
Correlated topic model
Limitations
Overview of topic evolution models
A Non-Markov Continuous-Time Method
Multiscale Topic Tomography
Detecting Topic Evolution of Scientific Literature
Discovering the Topology of Topics
Summary of topic evolution models
Comparison of Two Categories
CONCLUSION