Abstract
In the Programme for International Student Assessment (PISA), item response theory (IRT) scaling is used to examine the psychometric properties of items and scales and to provide comparable test scores across participating countries and over time. To balance the comparability of IRT item parameter estimates across countries against the best possible model fit, PISA uses a partial invariance approach: international (common) item parameters are estimated for the majority of items, while unique (country-specific) item parameters are allowed for item-country combinations in which misfit to the common parameters is identified. The goal of the current study is to establish item fit statistic thresholds for identifying such misfit. To evaluate the impact of various thresholds on scale and score estimation, we systematically examined the number of unique item parameters and the country performance distributions, and we compared overall model fit statistics using data from PISA 2015 and 2018. Results showed that a root mean square deviation (RMSD) threshold of .10 provides the best-fitting model while still yielding stable parameter estimates and sufficient comparability across groups. The applications and implications of the results are discussed.
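To make the item fit statistic concrete, the following is a minimal sketch of how an RMSD-type index can be computed for a single item: it measures the discrepancy between an observed (e.g., country-specific) item response curve and the model-implied curve under common parameters, weighted by the latent ability density. This is an illustrative simplification, not the operational PISA implementation; the 2PL model, the standard-normal weighting, and all parameter values below are assumptions for demonstration.

```python
# Illustrative RMSD item-fit sketch (hypothetical data, not the operational
# PISA procedure). Compares an observed response curve against the curve
# implied by common item parameters under a 2PL model.
import numpy as np

def irf_2pl(theta, a, b):
    """2PL item response function: P(correct | theta) with slope a, difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def rmsd(p_observed, theta_grid, a, b):
    """RMSD between observed and model-implied curves over a quadrature grid,
    weighted by a (discretized) standard-normal ability density."""
    weights = np.exp(-0.5 * theta_grid ** 2)
    weights /= weights.sum()
    p_model = irf_2pl(theta_grid, a, b)
    return np.sqrt(np.sum(weights * (p_observed - p_model) ** 2))

theta_grid = np.linspace(-4, 4, 41)
# Hypothetical country-specific observed curve: same slope, shifted difficulty.
p_obs = irf_2pl(theta_grid, a=1.2, b=0.4)
fit = rmsd(p_obs, theta_grid, a=1.2, b=0.0)
print(f"RMSD = {fit:.3f}")
```

Under a partial invariance approach of the kind described above, an item-country combination whose RMSD exceeds the chosen threshold (here, .10) would be assigned unique, country-specific parameters instead of the common ones.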