Abstract

Four item response theory linking methods (2 moment methods and 2 characteristic curve methods) were compared with concurrent (CO) calibration, focusing on how robust each approach was to format effects (FEs) when applied to multidimensional data that reflected the FEs associated with mixed-format tests. FEs were quantified as the correlation between the 2 dominant constructs measured by multiple-choice items and constructed-response items, and a hypothetical yet possibly practical situation was assumed in which FEs occurred in such a way that a mixed-format test had dimensions aligned with item format. Among the linking methods, the characteristic curve methods outperformed the moment methods, regardless of the presence of FEs. In general, CO calibration outperformed the 4 linking methods in linking accuracy and robustness to FEs. However, the performance of CO calibration was only slightly better than that of the characteristic curve methods. Although CO calibration and the characteristic curve methods showed some evidence of being robust to severe FEs (correlation of 0.5), this evidence did not appear to be consistent across test types.

