Abstract

With the increased use of constructed-response items in large-scale assessments, the cost of scoring has been a major consideration (Noh et al. in KICE Report RRE 2012-6, 2012; Wainer and Thissen in Applied Measurement in Education 6:103–118, 1993). In response to these scoring cost concerns, various forms of automated systems for scoring constructed-response items have been developed and used. The purpose of this research is to provide a comprehensive analysis of the generalizability of automated scoring results and to compare it to that of scores produced by human raters. The results of this study provide evidence supporting the argument that the automated scoring system offers outcomes nearly as reliable as those produced by human scoring. Based on these findings, the automated scoring system appears to be a promising alternative to human scoring, particularly for short factual-answer items.
