Abstract

This research is motivated by the expectation that automated scoring will play an increasingly important role in high-stakes educational testing. Approaches to safeguard the validity of score interpretation under automated scoring should therefore be investigated. This investigation illustrates one approach to studying the vulnerability of a scoring engine to construct-irrelevant response strategies (CIRS) based on the substitution of more sophisticated words. The approach is evaluated by simulating the effect of a specific strategy on real essays. The results suggest that the strategy had modest effects overall, although it was effective in improving the scores of a fraction of the lower-scoring essays. The broader implications of the results for quality assurance and quality control of automated scoring engines are discussed.
