Abstract

This study compares two methods of defining groups for the detection of differential item functioning (DIF): (a) pairwise comparisons and (b) composite group comparisons. We aim to emphasize and empirically support the notion that the choice of pairwise versus composite group definitions in DIF is a reflection of how one defines fairness in DIF studies. In this study, a simulation was conducted based on data from a 60-item ACT Mathematics test (ACT; Hanson & Béguin). The unsigned area measure method (Raju) was used as the DIF detection method. An application to operational data was also completed in the study, as well as a comparison of observed Type I error rates and false discovery rates across the two methods of defining groups. Results indicate that the amount of flagged DIF or interpretations about DIF in all conditions were not the same across the two methods, and there may be some benefits to using composite group approaches. The results are discussed in connection to differing definitions of fairness. Recommendations for practice are made.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call