Abstract

A shadow-test approach to the calibration of field-test items embedded in adaptive testing is presented. The objective function used in the shadow-test model selects both the operational and field-test items adaptively using a Bayesian version of the criterion of $D_{\mathrm{s}}$-optimality. The constraint set for the model can be used to hide the field-test items completely in the content of the test as well as to deal with such practical issues as random control of their exposure rates. The approach runs on efficient implementations of the Gibbs sampler for the real-time updating of the ability and field-test parameters. Optimal settings for the proposed algorithms were found and used to demonstrate item calibration with smaller than traditional sample sizes in runtimes fully comparable with conventional adaptive testing.

Highlights

  • The current focus of adaptive testing programs is typically on the selection of items from an operational pool, with optimization of the estimates of the examinees’ ability parameters as the statistical objective.

  • Just as adaptive selection of the operational items considerably reduces the test length necessary to score the examinees, similar selection of the field-test items can be expected to substantially reduce the sample size necessary to calibrate new items, an advantage already demonstrated in a few recent item calibration studies (Ren et al. 2017; van der Linden and Ren 2015).

  • Key features of the proposed approach to real-time item calibration include embedding of the field-test items in operational adaptive testing in random positions near the end of the test, representation of each of the response model parameters by short vectors of draws from their posterior distributions instead of fixed point estimates, and sequential updating of these vectors of posterior draws.
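The representation of parameters by vectors of posterior draws lends itself to Monte Carlo evaluation of a Bayesian optimality criterion. The following is a minimal sketch of that idea, not the implementation used in the study. It assumes a two-parameter logistic (2PL) response model, uses a plain posterior-averaged D-optimality criterion for the item parameters rather than the $D_{\mathrm{s}}$ criterion and shadow-test model of the paper, and all names (`prob_2pl`, `bayes_d_criterion`, `select_field_test_item`, the candidate labels) are hypothetical.

```python
import numpy as np


def prob_2pl(theta, a, b):
    """Success probability under a 2PL model (an assumption of this sketch)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))


def bayes_d_criterion(theta_draws, a_draws, b_draws, current_info):
    """Posterior-averaged determinant of the (a, b) information matrix
    after one additional response, given the information accumulated so far."""
    theta = np.asarray(theta_draws, dtype=float)[:, None]  # shape (T, 1)
    a = np.asarray(a_draws, dtype=float)[None, :]          # shape (1, S)
    b = np.asarray(b_draws, dtype=float)[None, :]          # shape (1, S)
    info = np.asarray(current_info, dtype=float)           # shape (2, 2)

    p = prob_2pl(theta, a, b)
    w = p * (1.0 - p)  # Bernoulli variance for every pair of draws, shape (T, S)

    # Gradient of the logit a * (theta - b) with respect to (a, b).
    g_a = theta - b
    g_b = -a * np.ones_like(theta)

    # Entries of info + w * g g^T for every (theta draw, item draw) pair.
    i_aa = info[0, 0] + w * g_a * g_a
    i_ab = info[0, 1] + w * g_a * g_b
    i_bb = info[1, 1] + w * g_b * g_b

    # Average the 2x2 determinant over all pairs of posterior draws.
    return float(np.mean(i_aa * i_bb - i_ab ** 2))


def select_field_test_item(theta_draws, candidates):
    """Return the candidate with the largest criterion value, plus all scores."""
    scores = {
        name: bayes_d_criterion(theta_draws, c["a"], c["b"], c["info"])
        for name, c in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Usage with made-up posterior draws (illustrative values only).
rng = np.random.default_rng(0)
theta_draws = rng.normal(0.3, 0.4, size=50)
candidates = {
    "ft_item_1": {"a": rng.normal(1.0, 0.2, 50), "b": rng.normal(0.0, 0.5, 50), "info": np.eye(2)},
    "ft_item_2": {"a": rng.normal(1.4, 0.2, 50), "b": rng.normal(1.5, 0.5, 50), "info": np.eye(2)},
}
best, scores = select_field_test_item(theta_draws, candidates)
```

Averaging over the draws, rather than plugging in point estimates, lets the remaining uncertainty about both the examinee's ability and the field-test item's parameters enter the selection. In a sequential setting, `info` and the draw vectors would be refreshed after each response.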


Introduction

The current focus of adaptive testing programs is typically on the selection of items from an operational pool, with optimization of the estimates of the examinees’ ability parameters as the statistical objective. The item pool itself is commonly calibrated with response data collected in separate field-test studies, in which the items are organized as sets of linked test forms administered to volunteering examinees. Field-test studies of this type require considerable resources. Other possible advantages of embedded field-testing of new items are more practical and include saving the time and expenses necessary to run separate calibration studies, collection of responses from motivated examinees answering the items under operational conditions, possibility of real-time monitoring of item fit and early withdrawal of items that appear
