Abstract
In this study, the differential item functioning (DIF) detection performances of the multiple indicators, multiple causes (MIMIC) and logistic regression (LR) methods for dichotomous data were investigated. The two methods were compared by calculating Type I error rates and power under each simulation condition. The manipulated conditions were: sample size (2000 and 4000 respondents), ability distribution of the focal group [N(0, 1) and N(-0.5, 1)], and the percentage of items with DIF (10% and 20%). The ability distribution of the reference group [N(0, 1)], the focal-to-reference group ratio (1:1), test length (30 items), and the between-group difference in difficulty parameters for DIF items (0.6) were held constant. When the two methods were compared on Type I error rates, the change in sample size had a stronger effect on the MIMIC method, whereas the change in the percentage of DIF items had a stronger effect on the LR method. When the two methods were compared on power, the most influential variable for both methods was sample size.
Highlights
Test items may be biased because, along with the intended construct, they may measure constructs that are not intended to be measured
The main finding of this study was that sample size was an important factor in differential item functioning (DIF) analyses conducted with the MIMIC and logistic regression (LR) methods
For the MIMIC method, the lowest Type I error rate was observed when the sample size was 4000, the percentage of DIF items was 10%, and the ability of both groups followed a standard normal distribution N(0, 1); the highest rate was observed when the sample size was 2000, the percentage of DIF items was 20%, and the ability of both groups followed a standard normal distribution N(0, 1)
Summary
Test items may be biased because, along with the intended construct, they may measure constructs that are not intended to be measured. An item may be related to one or more factors other than the construct of interest, and such construct-irrelevant factors can affect individuals' performance. At the test level, this issue is known as test bias; item bias concerns a specific item. Differential item functioning (DIF), a statistical approach used in item bias analysis, has been the subject of many recent studies (Zumbo, 1999).
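As context for the LR method summarized above, the sketch below simulates one dichotomous item with uniform DIF (a 0.6 difficulty shift for the focal group, mirroring the constant condition in this study) and applies the standard nested-logistic-regression test: a baseline model with the matching variable, then models adding group and a group-by-ability interaction, compared via likelihood-ratio chi-square. This is a minimal illustration, not the authors' exact simulation design; in practice the matching variable would be the observed total test score, which is approximated here by the simulated ability, and all variable names are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)

def fit_logit(X, y, iters=50):
    """Newton-Raphson logistic regression; returns the maximized log-likelihood."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        W = p * (1.0 - p)                      # weights for the Hessian
        H = X.T @ (X * W[:, None])
        g = X.T @ (y - p)
        b += np.linalg.solve(H + 1e-8 * np.eye(X.shape[1]), g)
    p = 1.0 / (1.0 + np.exp(-X @ b))
    return np.sum(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

# Simulate 2000 examinees, equal group sizes (1:1 ratio, as in the study)
n = 2000
group = np.repeat([0.0, 1.0], n // 2)          # 0 = reference, 1 = focal
theta = rng.normal(0.0, 1.0, n)                # ability, N(0, 1) for both groups
b_item = 0.6 * group                           # uniform DIF: item harder for focal group
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(theta - b_item)))).astype(float)

score = theta                                  # matching variable (proxy for total score)
ones = np.ones(n)

ll1 = fit_logit(np.column_stack([ones, score]), y)                         # baseline
ll2 = fit_logit(np.column_stack([ones, score, group]), y)                  # + group
ll3 = fit_logit(np.column_stack([ones, score, group, score * group]), y)   # + interaction

G2_uniform = 2.0 * (ll2 - ll1)                 # uniform DIF test, df = 1
G2_total = 2.0 * (ll3 - ll1)                   # combined uniform + nonuniform, df = 2
p_uniform = chi2.sf(G2_uniform, 1)
p_total = chi2.sf(G2_total, 2)
print(f"uniform DIF: G2 = {G2_uniform:.2f}, p = {p_uniform:.4f}")
```

With 1000 examinees per group and a 0.6 logit shift, the uniform-DIF likelihood-ratio test is expected to flag the item; running the same procedure on items simulated without DIF is how Type I error rates such as those reported above are estimated.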