Abstract

Background

Marking of essays is mainly carried out by human raters, who bring their own subjective and idiosyncratic evaluation criteria, which sometimes leads to discrepancies. These discrepancies may in turn raise issues of reliability and fairness. The current research explores the evaluation criteria used by markers of a national-level, high-stakes examination conducted at grade 12 by three examination boards in the south of Pakistan.

Methods

Fifteen markers and 30 students participated in the study. Data came from both quantitative and qualitative sources. Quantitative data took the form of scores on a set of three essays that all fifteen markers marked; to replicate current practice, the markers were not provided with a rating scale. Qualitative data came from semi-structured interviews with the markers and from short written commentaries in which they rationalized the scores they gave the essays.

Results

Many-facet Rasch model analyses (sketched below) revealed differences in the raters' scoring consistency and in the severity they exercised. In addition, analysis of the interviews and commentaries showed a great deal of variability in the raters' assessment criteria with respect to grammar, attitude towards mistakes, handwriting, length, creativity, organization, and use of cohesive devices.

Conclusions

The study shows a great deal of variability among markers, both in the scores they award and in the criteria they use to assess English essays. Even when they apply the same evaluation criteria, markers differ in the relative weight they give to each.
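For readers unfamiliar with the method, a standard many-facet Rasch specification (following Linacre's formulation) for a student $n$ scored by rater $j$ in score category $k$ is sketched below; the abstract does not state the exact parameterization used in the study, so this is illustrative only:

$$\log\left(\frac{P_{njk}}{P_{nj(k-1)}}\right) = \theta_n - \alpha_j - \tau_k$$

Here $\theta_n$ is the ability of student $n$, $\alpha_j$ is the severity of rater $j$, and $\tau_k$ is the difficulty of reaching score category $k$ from category $k-1$; further facets (for example, an essay-prompt difficulty term $\delta_i$) can be added in the same way. The rater severity estimates $\alpha_j$ and the associated rater fit statistics are what allow the severity and consistency differences reported in the Results to be quantified.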
