Abstract

Threats to construct validity should be reduced to a minimum. To that end, sources of bias, namely raters, items, and tests, as well as gender, age, race, language background, culture, and socio-economic status, need to be identified and removed. This study investigates raters' experience, raters' language background, and the choice of essay prompt as potential sources of bias. Eight raters, four native English speakers and four Persian L1 speakers of English as a Foreign Language (EFL), scored 40 essays on one general and one field-specific topic. The raters assessed these essays using the Test of English as a Foreign Language (TOEFL) holistic scale and the International English Language Testing System (IELTS) analytic band scores. Multifaceted Rasch Measurement (MFRM) was run to detect any biases. Although no statistically significant biases were found, several interesting results emerged illustrating the influence of construct-irrelevant factors such as raters' experience, L1, and educational background. Further research is warranted to investigate these factors as potential sources of rater bias.

Highlights

  • Finding sources of bias in language tests has perhaps been one of the most critical issues so far under scrutiny by researchers, testers, and teachers around the globe

  • This research aims to lessen or remove biases stemming from these issues, using Multifaceted Rasch Measurement (MFRM) to detect any interaction between rater main effects and essay prompt

  • Considering the importance of further research into such factors as raters' experience, language background, and the effect of essay prompt in writing assessment, we attempt to answer the following research questions: (1) Can raters' experience be a cause of bias in assessing essays holistically and analytically? (2) Are English-native and L1-Persian raters severe or lenient in writing assessment using analytic versus holistic rating scales? (3) Does the essay prompt introduce bias into raters' assessment of writing?


Introduction

Finding sources of bias in language tests has perhaps been one of the most critical issues so far under scrutiny by researchers, testers, and teachers around the globe. Working on the Test of German as a Foreign Language (TestDaF), Eckes (2005) examined rater severity and bias/interaction among raters, examinees, rating criteria, and gender. The ratings of 18 raters were subjected to MFRM, through which criterion-related bias measures were estimated. He concluded that "criteria perceived as highly important were more closely associated with severe ratings, and criteria perceived as less important were more closely associated with lenient ratings". In the present study, raters were considered experienced if they met the following criteria: holding a master's degree in arts or education, working as graduate or English as a Second Language (ESL) instructors, having taught and rated ESL writing for a minimum of 5 years, having undergone special assessment training, and considering themselves competent or expert raters. Considering the importance of further research into such factors as raters' experience, language background, and the effect of essay prompt in writing assessment, we attempt to answer the following research questions:

Research Question 1: Can raters' experience be a cause of bias in assessing essays holistically and analytically?

Research Question 2: Are English-native and L1-Persian raters severe or lenient in writing assessment using analytic versus holistic rating scales?

Research Question 3: Does the essay prompt introduce bias into raters' assessment of writing?
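To make the MFRM logic concrete, the sketch below implements a rating-scale many-facet Rasch model from scratch: the log-odds of adjacent score categories are modeled as examinee ability minus item difficulty, rater severity, and a category threshold. All parameter values are illustrative assumptions, not estimates from this study or from Eckes (2005); dedicated software such as FACETS would estimate these facets from real rating data.

```python
import math

def category_probabilities(theta, beta_item, alpha_rater, taus):
    """Rating-scale many-facet Rasch model: return the probability of
    each score category given examinee ability (theta), item difficulty
    (beta_item), rater severity (alpha_rater), and the ordered category
    thresholds (taus). A severity of 0 is an average rater; positive
    values mean a harsher rater."""
    # Build cumulative logits: category k's logit is the sum of
    # (theta - beta - alpha - tau_j) over thresholds j = 1..k.
    logits = [0.0]
    running = 0.0
    for tau in taus:
        running += theta - beta_item - alpha_rater - tau
        logits.append(running)
    exps = [math.exp(v) for v in logits]
    denom = sum(exps)
    return [e / denom for e in exps]

# Illustrative comparison: a severe rater (alpha = +0.5) versus a
# lenient one (alpha = -0.5) scoring the same average examinee on a
# 4-category scale. The severe rater shifts probability mass toward
# the lower score categories.
taus = [-1.0, 0.0, 1.0]
severe = category_probabilities(0.0, 0.0, 0.5, taus)
lenient = category_probabilities(0.0, 0.0, -0.5, taus)
print([round(p, 3) for p in severe])
print([round(p, 3) for p in lenient])
```

In a full MFRM bias analysis, an additional interaction term (e.g. rater-by-prompt) would be added to the logit and tested for significance; this sketch only shows the main-effect structure.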

Participants
Procedure
Design
Discussion and Conclusion
