Abstract

Objective. The present study uses simulated data to determine the optimal number of response categories needed to achieve adequate power in the ordinal logistic regression (OLR) model for differential item functioning (DIF) analysis in psychometric research. Methods. A hypothetical ten-item quality of life scale with three, four, and five response categories was simulated. The power and type I error rates of the OLR model for detecting uniform DIF were investigated under different combinations of ability distribution (θ), sample size, sample size ratio, and magnitude of uniform DIF across the reference and focal groups. Results. When θ was distributed identically in the reference and focal groups, increasing the number of response categories from 3 to 5 resulted in an increase of approximately 8% in the power of the OLR model for detecting uniform DIF. The power of OLR was less than 0.36 when the ability distributions in the reference and focal groups were highly skewed to the left and right, respectively. Conclusions. The clearest conclusion from this research is that the minimum number of response categories for DIF analysis using OLR is five. However, the impact of the number of response categories on DIF detection was lower than might be expected.
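To make the simulation design more concrete, the following is a minimal sketch of how ordinal responses to a single item with uniform DIF might be generated. It assumes a graded response model with discrimination `a` and ordered thresholds, a standard normal θ in both groups, and DIF introduced as a constant threshold shift for the focal group. The abstract does not state which generating model the study used, so these choices, and the names `simulate_item` and `simulate_groups`, are illustrative assumptions only.

```python
import math
import random


def simulate_item(theta, thresholds, a=1.0, dif=0.0, rng=random):
    """Draw one ordinal response in 0..J-1 (J = len(thresholds) + 1).

    Assumed graded response model: P(Y >= k | theta) is a logistic
    function of a * (theta - b_k). Uniform DIF is modeled (by assumption)
    as the same shift `dif` added to every threshold b_k.
    """
    u = rng.random()
    category = 0
    for b in thresholds:  # thresholds must be in ascending order
        p_at_or_above = 1.0 / (1.0 + math.exp(-a * (theta - (b + dif))))
        if u < p_at_or_above:
            category += 1
    return category


def simulate_groups(n_ref, n_focal, thresholds, dif, seed=0):
    """Simulate one item for a reference group and a focal group.

    Both groups draw theta from N(0, 1); only the focal group's
    thresholds are shifted by `dif`.
    """
    rng = random.Random(seed)
    ref = [simulate_item(rng.gauss(0, 1), thresholds, dif=0.0, rng=rng)
           for _ in range(n_ref)]
    focal = [simulate_item(rng.gauss(0, 1), thresholds, dif=dif, rng=rng)
             for _ in range(n_focal)]
    return ref, focal


# Five response categories (J = 5) need four thresholds.
ref, focal = simulate_groups(500, 500, [-1.5, -0.5, 0.5, 1.5], dif=0.5)
print(sum(ref) / len(ref), sum(focal) / len(focal))
```

Under these assumptions, a positive `dif` makes the item harder to endorse for the focal group at the same θ. Repeating the simulation many times and recording how often a DIF test rejects would give an empirical estimate of power.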

Highlights

  • In studies related to quality of life, measurement equivalence is an essential assumption for meaningful comparison of health-related quality of life scores across different populations

  • Our findings show that the power of the ordinal logistic regression (OLR) model improved as the number of response categories increased

  • For a moderate magnitude of differential item functioning (DIF = 0.5), in conditions 1, 2, and 3, in which the same ability distribution was assumed in both the reference and focal groups, increasing the number of response categories from J = 3 to J = 5 increased OLR power by approximately 10%, 8%, and 6% when R = 1, 2, and 3, respectively

Summary

Introduction

In studies related to quality of life, measurement equivalence is an essential assumption for meaningful comparison of health-related quality of life scores across different populations. Several methods are available for detecting DIF in Likert-type items, including multiple-group categorical confirmatory factor analysis (MGCFA), item response theory (IRT), and the ordinal logistic regression (OLR) model [3,4,5,6,7,8]. Although these methods rely on different assumptions and procedures to test measurement equivalence, they share conceptual similarities, such as the ability to examine both uniform and nonuniform DIF. In this context, some researchers have compared various DIF detection methods by focusing on real data. Despite the existence of these simulation studies, further attention is required to clarify some statistical properties of DIF detection methods under different conditions.
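For orientation, the OLR approach to DIF is commonly written as a cumulative logit model in which the item response is regressed on the ability measure, group membership, and their interaction. The formulation below is a sketch of that general approach; the symbols (α_j, β_1, β_2, β_3, G) are notation assumed here and do not appear in the text above.

```latex
% Y: ordinal response to the studied item, with categories 1, ..., J
% \theta: ability (in practice often proxied by the total scale score)
% G: group indicator (0 = reference, 1 = focal)
\operatorname{logit} P(Y \le j \mid \theta, G)
  = \alpha_j + \beta_1 \theta + \beta_2 G + \beta_3 (\theta \times G),
  \qquad j = 1, \dots, J - 1
```

In this notation, uniform DIF corresponds to β_2 ≠ 0 with β_3 = 0 (a constant group effect at every ability level), whereas nonuniform DIF corresponds to β_3 ≠ 0 (a group effect that varies with ability). Each effect is typically assessed by comparing nested models with a likelihood ratio test.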

