Abstract
Empirical studies demonstrated Type‐I error (TIE) inflation (especially for highly discriminating easy items) of the Mantel‐Haenszel chi‐square test for differential item functioning (DIF), when data conformed to item response theory (IRT) models more complex than Rasch, and when IRT proficiency distributions differed only in means. However, no published study manipulated proficiency variance ratio (VR). Data were generated with the three‐parameter logistic (3PL) IRT model. Proficiency VRs were 1, 2, 3, and 4. The present study suggests inflation may be greater, and may affect all highly discriminating items (low, moderate, and high difficulty), when IRT proficiency distributions of reference and focal groups differ also in variances. Inflation was greatest on the 21‐item test (vs. 41) and 2,000 total sample size (vs. 1,000). Previous studies had not systematically examined sample size ratio. Sample size ratio of 1:1 produced greater TIE inflation than 3:1, but primarily for total sample size of 2,000.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have