Abstract

One insufficiently grounded criticism of Latent Semantic Analysis is that its dimensions cannot be semantically interpreted. This is not true: several studies have transformed the latent semantic space so that its dimensions become interpretable. One such method is the Inbuilt-Rubric. Rather than grouping concepts around dimensions, as rotation methods based on Exploratory Factor Analysis do, the Inbuilt-Rubric method performs an a priori imposition of concepts onto the latent semantic space; that is, it follows a confirmatory strategy. This study proposes solutions for two limitations of the current Inbuilt-Rubric methodology: the first is inspired by Bifactor Models and manages the common variance of the concepts involved; the second randomizes the sequence used to perform the process. Both methods outperform the current Inbuilt-Rubric version in relevant content detection. The reported improvements can be incorporated into expert systems that use Latent Semantic Analysis and the Inbuilt-Rubric method for relevant content detection or text classification tasks.
