Abstract

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming more common in educational measurement research as a method for analyzing students’ responses to constructed-response items. Two popular topic models are latent semantic analysis (LSA) and latent Dirichlet allocation (LDA). LSA uses linear algebra techniques, whereas LDA uses an assumed statistical model and generative process. In educational measurement, LSA is often used in algorithmic scoring of essays due to its high reliability and agreement with human raters. LDA is often used as a supplemental analysis to gain additional information about students, such as their thinking and reasoning. This article reviews and compares the LSA and LDA topic models. This article also introduces a methodology for comparing the semantic spaces obtained by the two models and uses a simulation study to investigate their similarities.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call