Applying Logistic Regression to Detect Differential Item Functioning in Multidimensional Data.

Hui-Fang Chen, Kuan-Yu Jin

doi:10.3389/fpsyg.2018.01302


Open Access


Abstract

Conventional differential item functioning (DIF) approaches such as logistic regression (LR) often assume that a scale is unidimensional and match participants in the reference and focal groups on total scores. However, many educational and psychological assessments are multidimensional by design, and a total-score matching variable that does not reflect the test structure may not be good practice for DIF detection in multidimensional items. We propose the use of all subscores of a scale in LR and compare its performance with alternative matching methods, including the use of the total score and of individual subscores. We focused on the uniform DIF situation, in which 250, 500, or 1,000 participants in each group answered 21 items reflecting two dimensions, and the 21st item was the studied item. Five factors were manipulated in the study: (a) the test structure, (b) the number of cross-loaded items, (c) group differences in latent abilities, (d) the magnitude of DIF, and (e) group sample size. The results showed that, when the studied item measured a single domain, the conventional LR incorporating total scores as a matching variable yielded inflated false positive rates (FPRs) when the two groups differed in one latent ability. The situation worsened when one group had a higher ability in one domain and a lower ability in the other. The LR using a single subscore as the matching variable performed well in terms of FPRs and true positive rates (TPRs) when the two groups did not differ in either latent ability or differed in only one latent ability. However, this approach yielded inflated FPRs when the two groups differed in both latent abilities. The proposed LR using two subscores yielded well-controlled FPRs across all conditions and yielded the highest TPRs.
When the studied item measured two domains, the use of either the total score or two subscores controlled FPRs well and yielded similar TPRs across conditions, whereas the use of a single subscore resulted in inflated FPRs when the two groups differed in one or both latent abilities. In conclusion, we recommend the use of multiple subscores to match subjects in DIF detection for multidimensional data.
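
To make the matching choices concrete, the following is a minimal Python sketch of a uniform-DIF test by LR, comparing total-score matching with two-subscore matching on one simulated data set. It is an illustration under stated assumptions, not the authors' simulation code: the Rasch-type generating model, the independent abilities, the item difficulties, the exclusion of the studied item from the matching scores, the likelihood-ratio test on the group term, and all function names (`simulate`, `fit_logit`, `uniform_dif_pvalue`) are hypothetical choices made for this sketch.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_logit(X, y, iters=50):
    """Newton-Raphson maximum-likelihood fit; returns (coefficients, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    ll = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
        ll += math.log(p) if yi else math.log(1.0 - p)
    return beta, ll

def uniform_dif_pvalue(match_cols, group, y):
    """1-df likelihood-ratio test of the group term, given the matching scores."""
    means = [sum(c) / len(c) for c in zip(*match_cols)]
    # Center the matching scores for numerical stability (does not change the test).
    reduced = [[1.0] + [v - mu for v, mu in zip(row, means)] for row in match_cols]
    full = [r + [g] for r, g in zip(reduced, group)]
    _, ll0 = fit_logit(reduced, y)
    _, ll1 = fit_logit(full, y)
    chi2 = max(0.0, 2.0 * (ll1 - ll0))
    return math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df

def simulate(n_per_group=500, impact2=-1.0, dif=0.0, seed=1):
    """Assumed design: 10 anchor items per dimension, studied item on dimension 1."""
    rng = random.Random(seed)
    diffs = [-1.0 + 2.0 * j / 9 for j in range(10)]  # assumed anchor difficulties
    rows = []
    for g in (0.0, 1.0):  # 0 = reference group, 1 = focal group
        for _ in range(n_per_group):
            t1 = rng.gauss(0.0, 1.0)          # no group difference on dimension 1
            t2 = rng.gauss(impact2 * g, 1.0)  # group difference (impact) on dimension 2
            s1 = sum(rng.random() < sigmoid(t1 - b) for b in diffs)
            s2 = sum(rng.random() < sigmoid(t2 - b) for b in diffs)
            y = int(rng.random() < sigmoid(t1 - dif * g))  # studied item; dif=0 means no DIF
            rows.append((g, float(s1), float(s2), y))
    return rows

rows = simulate()
group = [r[0] for r in rows]
y = [r[3] for r in rows]
p_total = uniform_dif_pvalue([[r[1] + r[2]] for r in rows], group, y)
p_subs = uniform_dif_pvalue([[r[1], r[2]] for r in rows], group, y)
print(f"total-score matching:  p = {p_total:.4f}")
print(f"two-subscore matching: p = {p_subs:.4f}")
```

A single data set shows only one p-value per matching method. Estimating an FPR or TPR of the kind reported in the abstract would require repeating the simulation over many seeds and counting how often each p-value falls below the chosen significance level.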

Highlights

Differential item functioning (DIF) is commonly assessed to examine the prerequisite of test fairness (Stark et al., 2006) and has become routine practice in large-scale educational assessments such as the Trends in International Mathematics and Science Study (TIMSS) and the Programme for International Student Assessment (PISA).
Model 1 performed even more poorly: false positive rates (FPRs) were severely inflated when impacts occurred in both dimensions, regardless of the number of cross-loaded anchor items, and inflation worsened as group sizes increased.
Our findings indicate that when no impacts were involved, all models yielded satisfactory FPRs across all conditions, regardless of the number of cross-loaded anchor items or the number of dimensions measured by the item under consideration.

Summary

Introduction

Differential item functioning (DIF) is commonly assessed to examine the prerequisite of test fairness (Stark et al., 2006) and has become routine practice in large-scale educational assessments such as the Trends in International Mathematics and Science Study (TIMSS) and the Programme for International Student Assessment (PISA). The present study examines (1) the impact of dimensionality on DIF detection when tests are designed to measure two domains and (2) the impact of group mean differences in one or both latent abilities.



Journal: Frontiers in Psychology
Publication Date: Jul 27, 2018
Citations: 2
License type: CC BY 4.0


