Detecting Differential Item Functioning in 2PL Multistage Assessments

Rudolf Debelak, Dries Debeer, Martin J Tomasik, Sebastian Appelbaum

doi:10.3390/psych5020031

Open Access

Abstract

The detection of differential item functioning (DIF) is crucial for the psychometric evaluation of multistage tests. This paper discusses five approaches presented in the literature: logistic regression, SIBTEST, analytical score-based tests, bootstrap score-based tests, and permutation score-based tests. First, using a simulation study inspired by a real-life large-scale educational assessment, we compare the five approaches with respect to their type I error rate and their statistical power. Then, we present an application to an empirical data set. We find that all approaches show type I error rates close to the nominal alpha level. Furthermore, all approaches are shown to be sensitive to uniform and non-uniform DIF effects, with the score-based tests showing the highest power.

Full Text