Abstract

Information technologies have generated large volumes of documents available for analysis and use. Information systems can provide the user with the data needed for a specific purpose without human intervention, reducing the time required to deliver the expected response. Some traditional topic discovery models reported in the literature provide essential information, but it is still necessary to incorporate the kind of knowledge that a person draws on when reading a document. This work analyzes the behavior of Latent Dirichlet Allocation, Latent Semantic Analysis, and Probabilistic Latent Semantic Analysis when they incorporate semantic relationships of hypernymy, hyponymy, synonymy, holonymy, and meronymy extracted from the external knowledge source WordNet, with the aim of improving on the results obtained when the three techniques are applied to a set of documents without external knowledge. Incorporating semantic relationships such as hypernymy and synonymy improved on the initial results, but the best result was obtained by first applying the Lesk disambiguation algorithm and then Latent Dirichlet Allocation.
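To make the disambiguation step concrete, the following is a minimal sketch of a simplified Lesk procedure: it picks the sense whose gloss shares the most words with the surrounding context. It is an illustration only, not the authors' implementation. The sense inventory `TOY_SENSES`, the stopword list, and the function names are hypothetical; the work described above draws its glosses from WordNet.

```python
from typing import Dict, Optional, Set

# Hypothetical toy sense inventory used only for illustration.
# The work summarized above takes senses and glosses from WordNet.
TOY_SENSES: Dict[str, Dict[str, str]] = {
    "bank": {
        "bank.financial": "an institution that accepts deposits of money and makes loans",
        "bank.river": "sloping land beside a river or other body of water",
    }
}

# Small illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "at", "to", "in", "on"}


def tokenize(text: str) -> Set[str]:
    """Lowercase, split on whitespace, keep alphabetic non-stopword tokens."""
    return {w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS}


def simplified_lesk(word: str, context: str,
                    senses: Dict[str, Dict[str, str]] = TOY_SENSES) -> Optional[str]:
    """Return the sense id whose gloss overlaps most with the context words."""
    context_tokens = tokenize(context) - {word}
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses[word].items():
        overlap = len(context_tokens & tokenize(gloss))
        # Strict comparison: on a tie, the first listed sense is kept.
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


financial = simplified_lesk("bank", "she went to the bank to take out money and loans")
river = simplified_lesk("bank", "we walked along the river beside the water")
```

In a pipeline like the one described, each ambiguous word would be replaced by its selected sense (or by related terms such as its synonyms or hypernyms) before the documents are passed to the topic model.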
