Abstract

To introduce the topic of this chapter, let us consider a test developed with the goal of measuring writing proficiency. Ultimately, the test will be used to produce test scores reflecting each examinee’s level of writing proficiency, and to do so the test must elicit examinee responses that generate evidence of writing proficiency. To this end, the test developer is faced with a decision concerning the type of items or tasks to be used to elicit examinee responses. The test developer may opt to use multiple-choice (MC) items. Although cost-effective and efficient, MC items are indirect measures of writing proficiency and may not support inferences about an examinee’s writing skills that are as valid as those offered by more authentic writing tasks. To overcome the limitations of MC items, the test developer may choose to employ constructed-response (CR) items consisting of a series of prompts used to elicit written responses from examinees, and have human raters score the written responses using scoring rubrics. The resulting test may be composed entirely of CR items or of a combination of CR and MC items. While the CR item format offers a much more authentic assessment context, human-rater scoring suffers from several drawbacks, including inconsistency (e.g., due to rater severity/leniency or fatigue) and high cost due to the time and resources required to train raters and conduct the scoring process (see Zhang, 2013). The test developer can avoid the drawbacks of human-rater scoring by using CR items that are scored by a computer-automated scoring engine (referred to as automated scoring hereafter), which applies a set of predefined decision rules to assign a score to a CR item based on particular features of the examinee’s response. Automated scoring has the advantageous properties of being perfectly consistent across examinees and highly efficient from a resource perspective, but automated scores may be biased estimates of the scores that human raters would assign.
