Anchor Point Selection: Scale Alignment Based on an Inequality Criterion.

Carolin Strobl, Julia Kopf, Achim Zeileis, Timo von Oertzen, Lucas Kohler

doi:10.1177/0146621621990743

Abstract

For detecting differential item functioning (DIF) between two or more groups of test takers in the Rasch model, their item parameters need to be placed on the same scale. Typically this is done by means of choosing a set of so-called anchor items based on statistical tests or heuristics. Here the authors suggest an alternative strategy: By means of an inequality criterion from economics, the Gini Index, the item parameters are shifted to an optimal position where the item parameter estimates of the groups best overlap. Several toy examples, extensive simulation studies, and two empirical application examples are presented to illustrate the properties of the Gini Index as an anchor point selection criterion and compare its properties to those of the criterion used in the alignment approach of Asparouhov and Muthén. In particular, the authors show that—in addition to the globally optimal position for the anchor point—the criterion plot contains valuable additional information and may help discover unaccounted DIF-inducing multidimensionality. They further provide mathematical results that enable an efficient sparse grid optimization and make it feasible to extend the approach, for example, to multiple group scenarios.

Highlights

One of the major advantages of probabilistic test theory is that its assumptions are empirically testable.
In the following, the authors report the false alarm rate, computed as the percentage of items that were simulated as free of differential item functioning (DIF) but erroneously show a significant test result, and the hit rate, computed as the percentage of items that were simulated to have DIF and correctly show a significant test result.
A new approach has been suggested for placing the item parameter estimates of a Rasch model for two groups of test takers on the same scale.
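
The false alarm rate and hit rate defined in the highlights above can be sketched in a few lines of Python. This is only an illustration: the function name `dif_rates` and the inputs `has_dif` and `significant` are hypothetical and do not come from the paper, and the example flags are made up rather than simulation results.

```python
def dif_rates(has_dif, significant):
    """Return (false_alarm_rate, hit_rate) in percent.

    has_dif[i]     -- True if item i was simulated with DIF (hypothetical input)
    significant[i] -- True if the DIF test for item i was significant
    """
    n_free = sum(1 for d in has_dif if not d)
    n_dif = sum(1 for d in has_dif if d)
    # DIF-free items that are erroneously flagged
    false_alarms = sum(1 for d, s in zip(has_dif, significant) if s and not d)
    # DIF items that are correctly flagged
    hits = sum(1 for d, s in zip(has_dif, significant) if s and d)
    false_alarm_rate = 100.0 * false_alarms / n_free if n_free else float("nan")
    hit_rate = 100.0 * hits / n_dif if n_dif else float("nan")
    return false_alarm_rate, hit_rate


# Made-up example: 8 DIF-free items (one flagged), 2 DIF items (one flagged).
has_dif = [False] * 8 + [True] * 2
significant = [False] * 7 + [True] + [True, False]
far, hr = dif_rates(has_dif, significant)
```

In this made-up example, one of eight DIF-free items is flagged and one of two DIF items is detected.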

Summary

Introduction

One of the major advantages of probabilistic test theory is that its assumptions are empirically testable. The authors will show how selecting c according to the Gini Index leads to shifts between the two groups that make their item parameters well comparable. This approach can serve as the basis for any kind of graphical display as well as for formal DIF tests. Note that the application of this criterion in this framework, based on conditional maximum likelihood (CML) estimation for the Rasch model, means that certain properties of Asparouhov and Muthén’s approach, which was originally described for a two-parameter model and for optimizing means and variances, may not carry over (in particular, any effects of DIF affecting group variances). Focusing on this simple case allows us to concentrate on some fundamental properties of the criteria and compare the results to the extensive existing literature on DIF testing in the Rasch model.
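
To make the idea of choosing the shift c by the Gini Index more concrete, the following Python sketch may help. It rests on several assumptions that are not spelled out in this summary: that the criterion is the Gini Index of the absolute between-group differences in item parameters after shifting, that c is chosen to maximize it (so that most items differ very little and only a few differ a lot), and that it suffices to evaluate candidate shifts at which at least one item difference becomes zero. The function names `gini` and `select_shift` and the item parameter values are hypothetical.

```python
def gini(values):
    """Gini index of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0  # convention chosen here for the all-zero case
    # Standard rank-based formula on the sorted values
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def select_shift(beta_ref, beta_foc):
    """Pick the shift c that maximizes the Gini index of |beta_foc - c - beta_ref|.

    Assumption: only shifts that align at least one item exactly are tried.
    """
    diffs = [f - r for r, f in zip(beta_ref, beta_foc)]
    best_c, best_g = None, -1.0
    for c in sorted(set(diffs)):
        g = gini([abs(d - c) for d in diffs])
        if g > best_g:
            best_c, best_g = c, g
    return best_c, best_g


# Made-up item parameters: the focal group is offset by 0.3 on every item,
# and the last item carries an additional difference of 0.8 (DIF).
beta_ref = [-1.0, -0.5, 0.0, 0.5, 1.0]
beta_foc = [-0.7, -0.2, 0.3, 0.8, 2.1]
c, g = select_shift(beta_ref, beta_foc)
```

With these made-up values the selected shift is approximately 0.3, which aligns the four items without DIF and leaves only the last item with a sizeable difference.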

Results

Summary and Discussion