Abstract

It is common to find mixed-format data resulting from the use of both multiple-choice (MC) and constructed-response (CR) questions on assessments. Dealing with these mixed response types requires an understanding of what the assessment is measuring and the use of suitable measurement models to estimate latent abilities. Past research in educational measurement, however, has often set aside the written responses to CR items once the response scores have been analyzed. This study presents a method for bridging this gap by using a topic model, latent Dirichlet allocation (LDA), to uncover the structure in written answers and then using that information to augment results from traditional measurement models. In this study, a five-step framework is employed for assessing both the examinees' latent abilities and the internal structure of the written responses obtained from mixed-format assessments. An empirical dataset from Grade 8 examinees in a southeastern state on an English language arts assessment is used for illustration. Based on the dimensionality of the assessment, a unidimensional partial credit model and two multidimensional bi-factor models were fit to the data. A comparison of results from these analyses suggests that a constrained bi-factor model was most useful for capturing the mixed-format score patterns, and a five-topic model was selected for the textual responses. The topic distributions were found to be related to the latent abilities estimated from the constrained bi-factor model. This framework combines score patterns with textual responses and highlights the utility of pairing traditional response analysis with a state-of-the-art language model.
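The pairing of LDA topic distributions with ability estimates described above can be illustrated with a minimal sketch. The snippet below uses scikit-learn's LatentDirichletAllocation to fit a five-topic model to CR text responses and then correlates each examinee's topic proportions with ability estimates obtained separately (e.g., from a bi-factor model). The variable names (cr_texts, ability_estimates), the toy data, and the preprocessing choices are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed workflow, not the authors' code): fit a 5-topic LDA
# model to constructed-response texts and relate topic proportions to ability.
import numpy as np
from scipy.stats import pearsonr
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical inputs: one CR text per examinee, plus ability estimates
# produced by a separate measurement model (e.g., a constrained bi-factor model).
# A real analysis would involve many examinees, not this toy set.
cr_texts = [
    "the author uses evidence from the passage to support the claim",
    "the main idea of the passage is that practice improves writing",
    "the narrator changes his mind after the argument with his friend",
    "the evidence in paragraph two supports the author's main claim",
]
ability_estimates = np.array([0.42, -0.13, 0.05, 0.77])  # placeholder values

# Bag-of-words representation of the written responses.
vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(cr_texts)

# Fit LDA with five topics (the number selected in the study).
lda = LatentDirichletAllocation(n_components=5, random_state=0)
doc_topic = lda.fit_transform(doc_term)  # rows: examinees, cols: topic proportions

# Relate each topic's proportion to the estimated latent ability.
for k in range(doc_topic.shape[1]):
    r, p = pearsonr(doc_topic[:, k], ability_estimates)
    print(f"topic {k}: r = {r:.2f}, p = {p:.3f}")
```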
