Abstract

Large-scale assessments (LSAs) use Mislevy’s “plausible value” (PV) approach to relate student proficiency to noncognitive variables administered in a background questionnaire. This method requires background variables to be completely observed, a requirement that is seldom fulfilled. In this article, we evaluate and compare the properties of methods used in current practice for dealing with missing data in background variables in educational LSAs, which rely on the missing indicator method (MIM), with other methods based on multiple imputation. In this context, we present a fully conditional specification (FCS) approach that allows for a joint treatment of PVs and missing data. Using theoretical arguments and two simulation studies, we illustrate under what conditions the MIM provides biased or unbiased estimates of population parameters and provide evidence that methods such as FCS can provide an effective alternative to the MIM. We discuss the strengths and weaknesses of the approaches and outline potential consequences for operational practice in educational LSAs. An illustration is provided using data from the PISA 2015 study.
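As a rough illustration of the two families of approaches compared in the article, the sketch below contrasts the missing indicator method with a single FCS-style chained imputation on synthetic data. The variable names, the simulated MCAR mechanism, and the use of scikit-learn's IterativeImputer are our own assumptions for illustration only; the study's actual procedure jointly models plausible values and imputations, which this sketch does not attempt.

```python
# Minimal sketch (not the authors' implementation): missing indicator method (MIM)
# versus a simple FCS-style imputation for one incomplete background variable.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
n = 1000
# Hypothetical background variable (e.g., an SES index) with ~20% MCAR missingness
ses = rng.normal(size=n)
ses_obs = ses.copy()
ses_obs[rng.random(n) < 0.2] = np.nan
# Observed proxy for proficiency; a real LSA analysis would use plausible values
proficiency = 0.5 * ses + rng.normal(scale=0.8, size=n)

df = pd.DataFrame({"ses": ses_obs, "proficiency": proficiency})

# Missing indicator method (MIM): add a dummy flagging missingness and replace
# the missing values with a constant (here, the observed mean).
df["ses_missing"] = df["ses"].isna().astype(int)
df["ses_mim"] = df["ses"].fillna(df["ses"].mean())

# FCS-style imputation: regress each incomplete variable on the others and draw
# imputations from the fitted model (one chain shown; MI would repeat this).
imputer = IterativeImputer(sample_posterior=True, random_state=0)
imputed = imputer.fit_transform(df[["ses", "proficiency"]])
df["ses_fcs"] = imputed[:, 0]
```

In a full multiple-imputation workflow the FCS step would be repeated to produce several completed data sets, analyzed separately, and pooled; the MIM columns, by contrast, are used directly in a single analysis model.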

Highlights

  • The theoretical properties of the methods based on the missing indicator method (MIM), namely MIM-PE, MIM-LD, and the MIM combined with multiple imputation (MIM-MI), suggest that a naive use of MIM-LD and MIM-MI may lead to biased parameter estimates even when the data are missing completely at random (MCAR). However, this may not necessarily be a reason for concern in educational LSAs, where the large number of background variables may compensate for the effects of missing data under the MIM.

  • This is an encouraging finding because it illustrates that the MIM allows for approximately unbiased parameter estimates, at least under MCAR.

  • The same was true for MIM-MI, which led to a substantial reduction in bias in most cases, although some bias remained in a few individual parameters under MARx and MARy.

Summary

Objectives

We aimed to evaluate the procedures currently used for handling missing data in background variables in educational LSAs in order to clarify the conditions under which they can cause problems, and to implement and evaluate a strategy that allows for joint modeling of PVs for the proficiency scores and imputation of missing data in background variables. A further purpose of this study was to investigate whether, and to what extent, these findings can be expected to carry over to real data, where there is often a large number of background variables with a complex correlation structure and diverse patterns of missing data, which may show compensatory effects that are not observable in settings with fewer variables.
