Abstract

In this article we present the results of a Differential Item Functioning (DIF) study using Shealy and Stout's (1993) multidimensionality-based DIF analysis framework. In this framework, differences in test score distributions across groups of examinees may result from multidimensionality if secondary dimensions (beyond the primary dimension the test is intended to measure) differentially affect examinee performance. The framework therefore requires both statistical and substantive judgments for identifying potential DIF items and substantiating the causes of DIF, thereby strengthening a comprehensive construct validity argument. We illustrate step-by-step procedures for multidimensionality-based DIF analyses using LanguEdge reading comprehension test data. Qualitative data from think-aloud verbal protocols were used to generate DIF hypotheses about the differential functioning of vocabulary items between two groups of L2 learners: Indo-European and non-Indo-European. The Simultaneous Item Bias Test (SIBTEST; Shealy & Stout, 1993) was used to test the DIF hypotheses statistically. The DIF results supported the hypotheses by flagging four uniform DIF items and one crossing DIF item. Post-hoc analyses of the DIF-flagged items were performed by visually inspecting group differences using TestGraf (Ramsay, 2001), revisiting the qualitative verbal data, and analyzing cognate types. The results showed that DIF items with large effect sizes were associated with the following: (a) translation-equivalent cognates; (b) word meaning determined independently of context; and (c) distracter words that were less frequent and more difficult than those in the stems. In light of this empirical evidence, the article discusses implications for test development and validation processes.
