Abstract

Because of the plethora of large text documents available on the web, it is often difficult to go through each document and get a clear picture of what its text conveys. In this paper, we analyze several techniques for evaluating topic models. A topic model is a very popular approach for representing and smoothing the content of documents. We focus on uncovering the thematic structure of a corpus of documents, which helps in document classification and in compact document-topic representation. We review several well-known topic models, namely Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Indexing (PLSI), Latent Dirichlet Allocation (LDA), and the Pachinko Allocation Model (PAM), and encounter a few issues: topic models are not well suited to some social networking services (SNS) such as microblogging, and supervised learning techniques are designed for single-label corpora, i.e., they limit each document to a single label.
