Abstract

Developing a common metric is essential to successful applications of item response theory to practical testing problems, such as equating, differential item functioning, and computerized adaptive testing. In this study, the authors compared two methods for developing a common metric for the graded response model under item response theory: (a) linking separate calibration runs using equating coefficients from the characteristic curve method and (b) concurrent calibration using the combined data of the base and target groups. Concurrent calibration yielded consistently, albeit only slightly, smaller root mean square differences than linking for both item discrimination and location parameters. Similar results were observed for distance measures between item parameter estimates and item parameters. Concurrent calibration also yielded consistently, though only slightly, smaller root mean square differences for ability than linking did.
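
As a minimal sketch of the two quantities the comparison rests on, the Python below applies a linear metric transformation to one graded response item and computes a root mean square difference. It assumes the standard linear relation theta* = A*theta + B between metrics, under which a discrimination becomes a/A and each location becomes A*b + B. The function names `link_item` and `rmsd` and the coefficient names `A` and `B` are illustrative and not taken from the study; estimating A and B with the characteristic curve method is not shown.

```python
import math


def link_item(a, bs, A, B):
    """Re-express one graded response item on the base metric.

    Assumes theta* = A * theta + B, so the discrimination is divided by A
    and every location (category boundary) is rescaled and shifted.
    A and B are hypothetical equating coefficients supplied by the caller.
    """
    if A == 0:
        raise ValueError("slope coefficient A must be nonzero")
    return a / A, [A * b + B for b in bs]


def rmsd(estimates, targets):
    """Root mean square difference between two equal-length sequences."""
    if len(estimates) != len(targets) or not estimates:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(e - t) ** 2 for e, t in zip(estimates, targets)]
    return math.sqrt(sum(squared) / len(squared))


# Example with made-up numbers: one item with three location parameters.
a_linked, bs_linked = link_item(2.0, [-1.0, 0.0, 1.0], A=2.0, B=0.5)
location_rmsd = rmsd(bs_linked, [-1.5, 0.5, 2.5])
```

Under concurrent calibration no such transformation is needed, because both groups are calibrated in a single run; the same `rmsd` helper could still be applied to the resulting estimates.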
