Abstract
This paper introduces three classic statistical topic models: Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Indexing (PLSI), and Latent Dirichlet Allocation (LDA). A text classification method based on LDA is then briefly described, in which LDA serves as the text representation: each document is represented as a probability distribution over a fixed set of latent topics. A Support Vector Machine (SVM) is chosen as the classification algorithm. Finally, the LDA with SVM classification system scores higher on the evaluation metrics than the two other methods compared, LSI with SVM and Vector Space Model (VSM) with SVM, indicating better classification performance.