Abstract
Text mining is the process of deriving useful information from unstructured text data using statistical and mathematical methods. Major text mining tasks include text categorization, text clustering, concept extraction, document summarization, semantic similarity analysis, and author identification. This study examines semantic similarity analysis, which aims to determine how close texts are in meaning. Probabilistic latent semantic analysis and latent Dirichlet allocation are two probabilistic methods for determining semantic similarity between texts, and the study examines semantic analysis with both. It also discusses an application that analyzes semantic similarity in, and classifies, Turkish texts chosen from different news agencies. The application uses the R statistical programming language and Matlab.