Abstract

As methods for automated scoring of constructed‐response items become more widely adopted in state assessments, and are used in more consequential operational configurations, it is critical that their susceptibility to gaming behavior be investigated and managed. This article provides a review of research relevant to how construct‐irrelevant response behavior may affect automated constructed‐response scoring, and aims to address a gap in that literature: the need to assess the degree of risk before operational launch. A general framework is proposed for evaluating susceptibility to gaming, and an initial empirical demonstration is presented using the open‐source short‐answer scoring engines from the Automated Student Assessment Prize (ASAP) Challenge.

