Abstract

Test equating is a statistical procedure for adjusting for differences in difficulty across test forms of a standardized assessment. Equating results are expected to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations within the target population (see Dorans & Holland, 2000; Zumbo, 2007). This study discusses the challenges in defining a target population for test equating and in validating the inferences that can be made from the equated scores when the test takers cluster in distinct ability groups. This discussion takes place in the context of measurement validity (Zumbo, 2007) and optimal sampling design (Berger, 1997). The article then presents an alternative observed-score equating (OSE) approach that takes advantage of the OSE framework described in von Davier (2011). The flexibility of the OSE framework and the availability of the standard error of the equating difference (SEED), that is, the standard error of the difference between two equating functions obtained with two different methods, allow practitioners to statistically compare the equating results from different weighting schemes for distinctive subgroups of the target population. Both simulated and real data were used in this study. Item response theory OSE with multigroup calibration, followed by computing the score distributions with appropriate sampling weights, served as the equating criterion.
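To make the comparison of weighting schemes concrete, the following is a minimal illustrative sketch, not the article's implementation. It equates a form X to a form Y by weighted equipercentile equating under two hypothetical subgroup weighting schemes and contrasts the resulting difference with a bootstrap stand-in for the SEED (the article's OSE framework provides an analytic SEED instead). The data are synthetic, the single-group design, subgroup sizes, weights, and function names are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def weighted_equipercentile(x_scores, y_scores, w_x, w_y, grid):
    """Map form X scores to the form Y scale by matching weighted percentile ranks."""
    # Weighted cumulative proportion of X scores at or below each grid point.
    px = np.array([w_x[x_scores <= g].sum() / w_x.sum() for g in grid])
    # Invert the weighted Y distribution at those percentile ranks.
    order = np.argsort(y_scores)
    y_sorted, wy_sorted = y_scores[order], w_y[order]
    cum = np.cumsum(wy_sorted) / wy_sorted.sum()
    return np.interp(px, cum, y_sorted)

# Synthetic single-group data with two distinct ability subgroups (sizes arbitrary).
n = (1500, 500)
x = np.concatenate([rng.normal(20, 4, n[0]), rng.normal(28, 4, n[1])])
y = np.concatenate([rng.normal(22, 4, n[0]), rng.normal(30, 4, n[1])])
grid = np.arange(0, 41)

# Scheme A: observed mixture (unweighted); scheme B: subgroups weighted equally.
w_a = np.ones(x.size)
w_b = np.concatenate([np.full(n[0], 0.5 / n[0]), np.full(n[1], 0.5 / n[1])])

eq_a = weighted_equipercentile(x, y, w_a, w_a, grid)
eq_b = weighted_equipercentile(x, y, w_b, w_b, grid)
diff = eq_a - eq_b

# Bootstrap stand-in for the SEED: resample examinees, re-equate under both
# schemes, and take the standard deviation of the difference at each score point.
boot = []
for _ in range(200):
    idx = rng.integers(0, x.size, x.size)
    d = (weighted_equipercentile(x[idx], y[idx], w_a[idx], w_a[idx], grid)
         - weighted_equipercentile(x[idx], y[idx], w_b[idx], w_b[idx], grid))
    boot.append(d)
seed = np.std(boot, axis=0)

# Score points where the two weighting schemes disagree by more than two SEEDs.
print("Flagged score points:", grid[np.abs(diff) > 2 * seed])
```

In this kind of comparison, score points where the weighting-scheme difference exceeds roughly twice its standard error would signal that the choice of target-population weights matters for the equating at those scores.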
