Examining test fairness across gender in a computerised reading test: A comparison between the Rasch-based DIF technique and MIMIC

Xuelian Zhu,Vahid Aryadoust

doi:10.58379/nvft3338

Abstract

Test fairness has been recognised as a fundamental requirement of test validation. Two quantitative approaches to investigate test fairness, the Rasch-based differential item functioning (DIF) detection method and a measurement invariance technique called multiple indicators, multiple causes (MIMIC), were adopted and compared in a test fairness study of the Pearson Test of English (PTE) Academic Reading test (n = 783). The Rasch partial credit model (PCM) showed no statistically significant uniform DIF across gender and, similarly, the MIMIC analysis showed that measurement invariance was maintained in the test. However, six pairs of significant non-uniform DIF (p < 0.05) were found in the DIF analysis. A discussion of the results and post-hoc content analysis is presented and the theoretical and practical implications of the study for test developers and language assessment are discussed.

Full Text