Abstract
This paper studies the Type I error rate of the Breslow-Day (BD) test for detecting Nonuniform Differential Item Functioning (NUDIF) in a short test when the average ability of one group is significantly higher than that of the other. Its performance is compared with that of logistic regression (LR) and the standard Mantel-Haenszel (MH) procedure. Responses to a 20-item test were simulated without Differential Item Functioning (DIF) according to the three-parameter logistic model. The manipulated factors were sample size and item parameters. The design yielded 40 conditions, each replicated 50 times, and the false positive rate at the 5% significance level was recorded for each of the three methods in each condition. In most cases, BD was less prone to Type I error than LR and MH. With the BD test, the Type I error rate was close to the nominal level once the item with the highest discrimination and difficulty parameters, in the case of equally sized groups, was excluded from the goodness-of-fit test to the binomial distribution (the number of false positives among the fifty replications, each treated as a Bernoulli trial with parameter 0.05).
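The two ingredients named above, DIF-free responses generated under the three-parameter logistic model and a binomial reference for the false positive count over 50 replications, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the item parameter values, the group ability means, and the omission of a scaling constant in the logistic function are all hypothetical choices made for the example.

```python
import math
import random


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    a: discrimination, b: difficulty, c: pseudo-guessing.
    Assumption: no scaling constant (D = 1) in the exponent.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(thetas, items, rng):
    """Return one 0/1 response vector per examinee.

    Both groups share the same item parameters, so the data contain no DIF;
    groups differ only through the ability values passed in `thetas`.
    """
    return [
        [1 if rng.random() < p_correct_3pl(t, a, b, c) else 0 for (a, b, c) in items]
        for t in thetas
    ]


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))


rng = random.Random(0)

# Hypothetical 20-item test: (a, b, c) triples chosen only for illustration.
items = [(1.0 + 0.05 * i, -1.0 + 0.1 * i, 0.2) for i in range(20)]

# Hypothetical groups with unequal mean ability (0 versus 1, unit variance).
reference = [rng.gauss(1.0, 1.0) for _ in range(200)]
focal = [rng.gauss(0.0, 1.0) for _ in range(200)]

ref_data = simulate_responses(reference, items, rng)
foc_data = simulate_responses(focal, items, rng)

# Under a correctly sized 5% test, the number of false positives for one item
# across 50 replications follows Binomial(50, 0.05).
n_reps, alpha = 50, 0.05
expected_false_positives = n_reps * alpha
```

With this reference distribution, `binom_upper_tail(k, n_reps, alpha)` gives the probability of observing at least `k` false positives for an item when the test holds its nominal level, which is one way to judge whether an observed count is unusually high.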