Abstract

The use of evidence to guide policy and practice in education (Cooper, Levin, & Campbell, 2009) has included an increased emphasis on constructed-response items, such as essays and portfolios. Because assessments that go beyond selected-response items and incorporate constructed-response items are rater-mediated (Engelhard, 2002, 2013), it is necessary to develop evidence-based indices of quality for the rating processes used to evaluate student performances. This study proposes a set of criteria for evaluating the quality of ratings based on the concepts of measurement invariance and accuracy within the context of a large-scale writing assessment. Two measurement models are used to explore indices of quality for raters and ratings: the first model provides evidence for the invariance of ratings, and the second model provides evidence for rater accuracy. Rating quality is examined within four writing domains from an analytic rubric. Further, this study explores the alignment between indices of rating quality based on these invariance and accuracy models within each of the four domains of writing. Major findings suggest that rating quality varies across analytic rubric domains, and that there is some correspondence between indices of rating quality based on the invariance and accuracy models. Implications for research and practice are discussed.
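
The abstract does not state the specific form of the two measurement models. As a hedged illustration only, invariance-based indices of rating quality in this tradition are often obtained from a many-facet Rasch model, sketched here under that assumption:

\ln\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \lambda_j - \tau_k

where \theta_n is the writing proficiency of student n, \delta_i is the difficulty of domain i, \lambda_j is the severity of rater j, and \tau_k is the threshold between rating categories k-1 and k. Under such a model, rater severity estimates and rater fit statistics would serve as invariance-based indices of rating quality; an accuracy model would instead compare operational ratings to known or expert criterion ratings. The notation here is illustrative and not taken from the article itself.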
