Abstract

The research reported here suggests that raters involved in writing assessment rely more on their own criteria as a basis for judgment than on the standards provided by scale descriptors. The study sampled the think-aloud protocols of eight raters who scored 15 essays using the Test of Written English (TWE) holistic scoring guide. The verbal report data indicated that just under five percent of the statements made by the raters related to the issues assessed in the TWE. These findings call the utility of holistic rating scale descriptors into question, foregrounding the raters' descriptor-independent judgments.

Highlights

  • This study attempts to determine how much of raters’ judgments reflects the scoring rubrics and descriptors

  • The flexibility of descriptors and rubrics, as Norton Pierce (1991) has suggested, leaves room for the rater to credit the writing for features that are not part of the scoring guide, resulting in lower reliability

  • He enumerates the following rating scale criteria as worthy of consideration: Does the scale capture the essential qualities of the written performance? Do the abilities the scale describes progress in the ways it suggests? Can raters agree on their understanding of the descriptions that define the levels? Can raters distinguish all the band levels clearly and interpret them consistently? Can raters effectively interpret ‘relative’ terms such as ‘limited’, ‘reasonable’, and ‘adequate’? Do raters always confine themselves exclusively to the context of the scale? What is the role of retraining examiners in the use of the new rating scale in the rating process?


Introduction

This study attempts to determine how much of raters’ judgments reflects the scoring rubrics and descriptors. Raters’ verbalizations were transcribed so that we could analyze these texts, looking both for the rubrics and descriptors supplied by the study and for any other writing features the raters introduced into the scoring procedure. The flexibility of descriptors and rubrics, as Norton Pierce (1991) has suggested, leaves room for the rater to credit the writing for features that are not part of the scoring guide, resulting in lower reliability. This is especially true in holistic scoring, where scoring guides allow raters to include writing features in their assessment that the guide does not specify. We argue that raters’ idiosyncratic preferences take precedence over holistic scoring guides, which play only a minor role in the writing assessment process.

