Abstract

The C-test is a type of gap-filling test designed to efficiently measure second language proficiency. A typical C-test consists of several short paragraphs in which the second half of every second word is deleted. The partially deleted words are treated as items nested within their paragraph. Given this testlet structure, it is often taken for granted that the C-test design violates the local independence assumption. However, this assumption has not been fully investigated in C-test research to date, nor has the evaluation of alternative psychometric models (i.e., unidimensional and multidimensional) for calibrating and scoring the C-test. This study addressed each of these issues using a large data set of responses to an English-language C-test. First, we examined the local item independence assumption via multidimensional item response theory (IRT) models, Yen’s Q3, and the Jackknife Slope Index. Second, we evaluated several IRT models to determine optimal approaches to scoring the C-test. The results support an interpretation of unidimensionality for the C-test items within a paragraph, with only minor evidence of local item dependence. Furthermore, the two-parameter logistic (2PL) IRT model was found to be the most appropriate model for calibrating and scoring the C-test. Implications for designing, scoring, and analyzing C-tests are discussed.
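For readers unfamiliar with the 2PL model the abstract recommends, a minimal sketch of its item response function may help; the parameter values below are purely illustrative, not estimates from the study:

```python
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """2PL item response function: probability that an examinee with
    ability theta answers an item correctly, given the item's
    discrimination (a) and difficulty (b) parameters."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: moderately discriminating (a = 1.2), easy (b = -0.5).
# An average examinee (theta = 0) has about a 65% chance of success.
print(round(p_correct(theta=0.0, a=1.2, b=-0.5), 3))  # → 0.646

# When ability equals difficulty (theta == b), the probability is 0.5.
print(p_correct(theta=1.0, a=1.5, b=1.0))  # → 0.5
```

The discrimination parameter a is what distinguishes the 2PL from the simpler Rasch (1PL) model, which fixes a to be equal across items.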
