Abstract

In this article, we investigate the creation of comparable score scales across countries in international assessments, examining potential improvements to the score scale calibration procedures currently used in international large-scale assessments. Our approach seeks to improve fairness in scoring, since current calibrations often ignore item misfit, and to obtain improved model-data fit estimates. To this end, we examine two alternative score scale calibration procedures: (a) a language-based score scale and (b) a more parsimonious international scale in which a large proportion of international parameters are retained alongside a subset of country-based parameters for items that misfit on the international scale. In our analyses, we used data from all 40 countries participating in the Progress in International Reading Literacy Study. Our findings revealed that current score scale calibration procedures yield large numbers of misfitting items (more than 25% for some countries). Our proposed approach diminished the effect of item misfit on score scale calibrations and also yielded improved model-data fit estimates. These results enhance confidence in measurements obtained from international large-scale assessments.
