Abstract

Assessing differential item functioning (DIF) using the ordinal logistic regression (OLR) model depends heavily on the asymptotic sampling distribution of the maximum likelihood (ML) estimators. The ML estimation method, which is often used to estimate the parameters of the OLR model for DIF detection, may be substantially biased with small samples. This study aimed to propose a new application of the elastic net regularized OLR model, a special type of machine learning method, for assessing DIF between two groups with small samples. Accordingly, a simulation study was conducted to compare the power and type I error rates of the regularized and nonregularized OLR models in detecting DIF under various conditions, including moderate and severe magnitudes of DIF (DIF = 0.4 and 0.8), sample size (N), sample size ratio (R), scale length (I), and weighting parameter (w). The simulation results revealed that for I = 5 and regardless of R, the elastic net regularized OLR model with w = 0.1, as compared with the nonregularized OLR model, increased the power of detecting moderate uniform DIF (DIF = 0.4) by approximately 35% and 21% for N = 100 and 150, respectively. Moreover, for I = 10 and severe uniform DIF (DIF = 0.8), the average power of the elastic net regularized OLR model with 0.03 ≤ w ≤ 0.06, as compared with the nonregularized OLR model, increased by approximately 29.3% and 11.2% for N = 100 and 150, respectively. In these cases, the type I error rates of the regularized and nonregularized OLR models were below or close to the nominal level of 0.05. In general, this simulation study showed that the elastic net regularized OLR model outperformed the nonregularized OLR model, especially in extremely small sample size groups. Furthermore, the present research provides a guideline and some recommendations for researchers who conduct DIF studies with small sample sizes.
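The weighting parameter w mentioned above is the elastic net mixing weight between the LASSO and ridge penalties. As a point of reference (this is the standard elastic net formulation; the paper's exact parameterization of w and λ may differ), the regularized OLR coefficients maximize a penalized log-likelihood of the form

```latex
\hat{\beta} = \arg\max_{\beta}\; \ell(\beta)
  - \lambda \left[\, w \,\lVert \beta \rVert_1
  + \frac{1-w}{2}\,\lVert \beta \rVert_2^2 \right],
```

where ℓ(β) is the ordinal logistic log-likelihood, w = 1 recovers the pure LASSO, and w = 0 recovers ridge regression. Small w values such as those found effective here (0.03–0.1) therefore correspond to a predominantly ridge-like penalty with a mild sparsity component.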

Highlights

  • In psychometric research such as health-related quality of life (HRQoL), measurement invariance, known as differential item functioning (DIF), is a prerequisite assumption for the valid comparison of HRQoL scores across people from different subgroups

  • Note: DIF: differential item functioning; I: number of items in the scale; J: number of response categories; LASSO: least absolute shrinkage and selection operator; λ: regularization parameter; OLR: ordinal logistic regression; w: weighting parameter; Ratio: sample size ratio between the focal and reference groups; nf and nr indicate the sample sizes in the focal and reference groups, respectively; N: the total sample size (N=nf +nr). ∗These λ values were obtained according to the Bayesian information criterion (BIC)

Introduction

In psychometric research such as health-related quality of life (HRQoL), measurement invariance, assessed through differential item functioning (DIF) analysis, is a prerequisite assumption for the valid comparison of HRQoL scores across people from different subgroups (e.g., groups distinguished by gender, age, race, or health conditions). DIF occurs when individuals from different groups respond differently to specific items in a questionnaire after controlling for the construct being measured [1, 2]. The OLR model can evaluate both uniform and nonuniform DIF and can control for other categorical and continuous variables that may affect the results of DIF analysis [3, 4]. Uniform DIF occurs when the difference in item response probabilities remains constant across the complete construct domain, whereas nonuniform DIF is evident when the direction of DIF differs across various parts of the construct scale [5, 6].

