In educational testing, there is increasing interest in the simultaneous estimation of overall scores and subscores. This study compares the reliability and precision of simultaneously estimated overall scores and subscores under the multidimensional item response theory (MIRT), higher-order item response theory (HO-IRT), and bi-factor models. Data from the TIMSS 2015 mathematics test were used. The test consists of 35 items, four of which are polytomously scored (0, 1, 2) and the remaining 31 dichotomously scored (0, 1). The four content domains are number (14 items), algebra (9 items), geometry (6 items), and data and chance (6 items). Ability parameters were estimated using the BMIRT software. The results showed that MIRT and HO-IRT performed similarly in terms of precision and reliability for subscore estimates. The MIRT maximum information method had the smallest standard error of measurement for the overall score estimates. All three methods performed similarly in terms of overall score reliability. The findings suggest that, among the three methods compared, HO-IRT appears to be the better choice for simultaneously estimating the overall score and subscores for the TIMSS 2015 data. Recommendations for testing practice and future research are provided.