This study shows that second/foreign language (L2) word difficulty estimates derived from item response theory (IRT) and classical test theory (CTT) frameworks are virtually identical in the context of vocabulary testing. This conclusion is reached via a two-stage process: (a) psychometric assessments of both approaches and (b) L2 word difficulty modelling with lexical sophistication. When applied to data collected from first/native language (L1) Japanese learners of English as a foreign language (EFL), both approaches led to similar conclusions about the psychometric properties of the construct. Furthermore, CTT- and IRT/Rasch-derived estimates of word difficulty yielded nearly identical results in the predictive models. Although the “CTT-vs-IRT” debates of the past few decades have settled on a middle ground accepted in most educational settings, this study serves as a useful demonstration for L2 vocabulary researchers who appear to rely heavily on Rasch/IRT analysis. The findings have practical implications for L2 word difficulty research: Rasch analysis (an IRT procedure) alone might not be sufficient for validation, although it could be preferable for inferential testing because of its potential to reduce the chance of Type II errors.