Abstract

This study investigates how experienced and inexperienced raters score essays written by ESL students on two different prompts. The quantitative analysis used multi-faceted Rasch measurement, which provides measures of rater severity and consistency. It showed that the inexperienced raters were more severe than the experienced raters on one prompt but not on the other, and that differences between the two groups of raters were eliminated following rater training. The qualitative analysis, which examined raters' think-aloud protocols produced while scoring essays, provided insights into the reasons for these differences. The differences were related to the ease with which the scoring rubric could be applied to the two prompts and to how the two groups of raters perceived the appropriateness of the prompts.
