A scale purification procedure for evaluation of differential item functioning

Muhammad Naveed Khalid, Cees A.W. Glas

doi:10.1016/j.measurement.2013.12.019

Open Access

Journal: Measurement	Publication Date: Jan 8, 2014
Citations: 7	License type: public-domain

Affiliation: Cambridge University Press, University of Cambridge

Abstract

Item bias, or differential item functioning (DIF), has an important impact on the fairness of psychological and educational testing. In this paper, DIF is seen as a lack of fit to an item response theory (IRT) model. Inferences about the presence and importance of DIF require a process of so-called test purification, in which items with DIF are identified using statistical tests and DIF is modeled using group-specific item parameters. In the present study, DIF is identified using item-oriented Lagrange multiplier statistics. The first problem addressed is that the dependence among these statistics might cause problems in the presence of a relatively large number of DIF items. Therefore, a stepwise procedure is proposed in which DIF items are identified one or two at a time. Simulation studies are presented to illustrate the power and Type I error rate of the procedure. The second problem pertains to the importance of DIF, i.e., the effect size, and the related problem of defining a stopping rule for the search procedure for DIF. The estimate of the difference between the means and variances of the ability distributions of the studied groups of respondents is used as an effect size, and the purification procedure is stopped when the change in this effect size becomes negligible.
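The control flow of the stepwise procedure and its stopping rule can be sketched roughly as follows. This is only an illustration of the loop described in the abstract, not the authors' implementation: the function name `purify`, the `fit` callback interface, and the defaults for `alpha`, `per_step`, and `tol` are all assumptions made here. The callback is assumed to re-estimate the IRT model with group-specific parameters for the flagged items and to return the Lagrange multiplier p-values for the remaining items together with the estimated differences in group means and variances.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical interface (an assumption, not from the paper): given the items
# currently modeled with group-specific parameters, return
#   (LM-test p-values for the remaining items,
#    (difference in group means, difference in group variances)).
FitFn = Callable[[Set[int]], Tuple[Dict[int, float], Tuple[float, float]]]


def purify(fit: FitFn, alpha: float = 0.05, per_step: int = 1,
           tol: float = 0.01, max_steps: int = 50) -> List[int]:
    """Flag DIF items stepwise; stop when the effect size barely changes."""
    flagged: List[int] = []
    p_values, effect = fit(set(flagged))
    for _ in range(max_steps):
        # Candidates: unflagged items with a significant test, smallest p first.
        candidates = sorted((p, i) for i, p in p_values.items()
                            if i not in flagged and p < alpha)
        if not candidates:
            break
        chosen = [i for _, i in candidates[:per_step]]
        # Refit with the chosen items given group-specific parameters.
        new_p, new_effect = fit(set(flagged) | set(chosen))
        # Change in effect size: largest absolute shift in either component.
        change = max(abs(a - b) for a, b in zip(new_effect, effect))
        flagged.extend(chosen)
        p_values, effect = new_p, new_effect
        if change < tol:  # assumed stopping rule: negligible change
            break
    return flagged
```

Setting `per_step` to 1 or 2 corresponds to identifying DIF items one or two at a time. In this sketch the items chosen in the final step remain flagged even when the resulting change falls below `tol`; dropping them instead would be an equally plausible reading of the stopping rule.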

Full Text