Abstract

The validity of inferences based on test scores is threatened when examinees' lack of test-taking effort is ignored. A possible solution is to add test-taking effort indicators to the measurement model after non-effortful responses are flagged. As a new application of the multidimensional item response theory (MIRT) model for non-ignorable missing responses, this article proposed a MIRT method to account for non-effortful responses. Two simulation studies were conducted to examine the impact of non-effortful responses on item and latent ability parameter estimates, and to evaluate the performance of the MIRT method in comparison with the three-parameter logistic (3PL) model and the effort-moderated model. Results showed that: (a) as the percentage of non-effortful responses increased, the unidimensional 3PL model yielded poorer parameter estimates; (b) the MIRT model obtained item parameter estimates as accurate as those of the effort-moderated model; (c) the MIRT model provided the most accurate ability parameter estimates when the correlation between test-taking effort and ability was high. A real data analysis was also conducted for illustration. Limitations and directions for future research are discussed.
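
To make the setup concrete, below is a minimal simulation sketch of the kind of data structure described above: ability and test-taking effort as correlated latent traits, effortful responses following a 3PL model, and non-effortful responses treated as random guesses. All parameter values and the effort process here are illustrative assumptions, not the article's actual simulation design.

    import numpy as np

    rng = np.random.default_rng(1)
    n_persons, n_items, rho = 1000, 20, 0.5

    # Ability (theta) and test-taking effort (eta) as correlated traits
    cov = [[1.0, rho], [rho, 1.0]]
    theta, eta = rng.multivariate_normal([0.0, 0.0], cov, size=n_persons).T

    a = rng.lognormal(0.0, 0.3, n_items)  # discriminations
    b = rng.normal(0.0, 1.0, n_items)     # difficulties
    c = np.full(n_items, 0.2)             # pseudo-guessing parameters

    # Each person-item pair is effortful with a probability driven by eta
    # (a 2PL-type effort process; its parameters here are made up)
    p_effort = 1.0 / (1.0 + np.exp(-(1.2 * eta[:, None] + 1.0)))
    effortful = rng.uniform(size=(n_persons, n_items)) < p_effort

    # Responses: 3PL when effortful, random guessing (4 options) otherwise
    p_3pl = c + (1.0 - c) / (1.0 + np.exp(-a * (theta[:, None] - b)))
    p = np.where(effortful, p_3pl, 0.25)
    x = (rng.uniform(size=(n_persons, n_items)) < p).astype(int)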

Highlights



  • For all conditions considered in this study, the root mean squared errors (RMSEs) of the parameter estimates from the MIRT model and the effort-moderated model were much smaller than those from the 3PL model, and barely any difference between the item parameter estimates under the former two models was observed.
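
Here RMSE is the usual root mean squared deviation of the estimates from the generating values, e.g. (a sketch with hypothetical values):

    import numpy as np
    est = np.array([0.95, 1.40, 0.62])  # hypothetical parameter estimates
    gen = np.array([1.00, 1.30, 0.70])  # corresponding generating values
    rmse = np.sqrt(np.mean((est - gen) ** 2))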


Introduction

Wise and Kong (2005) noted three situations in which non-effortful responses can occur: (a) assessment programs (e.g., PISA) that have serious potential consequences for institutions but few consequences for examinees; (b) high-stakes testing programs that sometimes administer test items in low-stakes settings, such as in the pilot study of a testing program (Cheng et al., 2014); and (c) the substantial number of measurement studies conducted in low-stakes settings at colleges and universities. When a unidimensional item response theory (IRT) model is applied to test scoring, test-taking non-effort leads to biased estimates of both item parameters and latent abilities (Wise and DeMars, 2006). It has been shown that group means would be underestimated by around 0.20 SDs if the total amount of non-effortful responses exceeded 6.25%, 12.5%, and 12.5% for easy, moderately difficult, and difficult tests, respectively (Rios et al., 2017).
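
In this literature, non-effortful responses are typically flagged from response times: a response is flagged when its time falls below an item-level threshold, and Wise and Kong's (2005) response time effort (RTE) index is the proportion of a person's responses that are not flagged. Below is a minimal sketch of this idea; the 10%-of-median threshold rule is an illustrative assumption, as published threshold-setting methods vary.

    import numpy as np

    def flag_noneffort(rt, frac=0.10):
        """Flag responses whose time falls below an item-level threshold.

        rt: (persons x items) array of response times in seconds.
        frac: threshold as a fraction of each item's median time
              (the 10% rule is illustrative; other rules exist).
        """
        thresholds = frac * np.median(rt, axis=0)
        return rt < thresholds  # True = flagged as non-effortful

    def rte(rt, frac=0.10):
        """Response time effort (Wise & Kong, 2005): the proportion of
        a person's responses not flagged as non-effortful."""
        return 1.0 - flag_noneffort(rt, frac).mean(axis=1)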
