Abstract
The present study reports the development and validation of a rating scale for the Iranian EFL academic writing assessment context. To this end, the study was conducted in three distinct phases. Early in the study, the researcher interviewed a number of raters at different universities. Next, a questionnaire was developed based on the results of the interviews and the related literature, and was sent to thirty experienced raters from ten major state universities in Iran. The results of this country-wide survey showed that no objective scale was in use by the raters in the context. Therefore, in the second, development phase of the study, fifteen of the raters who had participated in the first phase were asked to verbalize their thoughts while each rated five essays. At the end of this phase, a first draft of the scale was produced. Finally, in the validation phase, ten raters each rated a set of twenty essays using the newly developed scale, and eight of them then took part in a follow-up retrospective interview. Analysis of the raters’ performance with FACETS showed high reliability and supported the validity of the new scale. In addition, while the qualitative interview findings identified some problems with the structure of the scale, on the whole they showed that the introduction of the scale was well received by the raters. The pedagogical implications of the study are discussed, and further validation of the scale in the context is called for.
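FACETS, the program used in the validation phase, implements many-facet Rasch measurement (MFRM), which models each observed score as a joint function of essay ability and rater severity. As a rough illustration of what such an analysis estimates, the Python sketch below simulates dichotomized (pass/fail) ratings from ten raters on twenty essays, mirroring the validation-phase design, and recovers the two facets by penalized joint maximum likelihood. All values are invented for illustration and are not taken from the study's data.

```python
# Minimal sketch of the many-facet Rasch model (MFRM) behind FACETS,
# simplified to dichotomous (pass/fail) ratings. All numbers here are
# simulated for illustration, not drawn from the study.
import numpy as np

rng = np.random.default_rng(0)
n_essays, n_raters = 20, 10                    # validation-phase design

ability = rng.normal(0.0, 1.0, n_essays)       # true essay abilities (logits)
severity = rng.normal(0.0, 0.5, n_raters)      # true rater severities (logits)

# Dichotomous MFRM: P(pass) = logistic(ability - severity).
logits = ability[:, None] - severity[None, :]
ratings = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))

# Penalized joint maximum likelihood by gradient ascent; the small L2
# penalty keeps estimates finite for essays rated all-0 or all-1.
a_hat = np.zeros(n_essays)
s_hat = np.zeros(n_raters)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(a_hat[:, None] - s_hat[None, :])))
    resid = ratings - p                        # observed minus expected
    a_hat += 0.5 * (resid.mean(axis=1) - 0.01 * a_hat)
    s_hat += 0.5 * (-resid.mean(axis=0) - 0.01 * s_hat)
    s_hat -= s_hat.mean()                      # anchor the scale's origin

print("rater severity estimates:", np.round(s_hat, 2))
print("recovery of true severities (r):",
      round(float(np.corrcoef(severity, s_hat)[0, 1]), 2))
```

In the study itself, FACETS would additionally report rater fit statistics and separation reliability; this sketch shows only the ability/severity estimation at the core of the model.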
Highlights
In performance-based assessment, scoring rubrics are important because they specify the construct to be performed and measured
Careful investigation of the questionnaire items exploring the existence of a rating scale among Iranian English as a foreign language (EFL) raters, addressing the first research question, showed that the raters who contributed to this study doubted the existence of an objective rating scale in their rating practice
While a substantial number of raters disagreed with an impressionistic approach to scoring (56.66%, item 15) and strongly believed that all raters had some criteria in their scoring (80%, item 17), they held differing attitudes about a common rating scale in their own rating practice
Summary
In performance-based assessment, scoring rubrics (variously called rating scales or marking schemes) are important because they specify the construct to be performed and measured (2010, p. 43) and can reduce the long-recognized problem of rater variability (Bachman, Lynch, & Mason, 1995; McNamara, 1996). As a result, they promote reliable test scores and valid score inferences (Boettger, 2010; Crusan, 2010; Crusan, 2015; Dempsey, PytlikZillig, & Bruning, 2009; Knoch, 2009; Lukácsi, 2020; Rakedzon & Baram-Tsabari, 2017). McNamara (1996, p. 182) asserts that in the field of language assessment, “we are frequently presented with rating scales as products for consumption and are told little of their provenance and of their rationale.” While the ad hoc rubrics developed in this way can help teachers with classroom assessment, for high-stakes tests with significant impacts on the educational lives of stakeholders, rubrics grounded in theory are needed (Knoch, 2011; McNamara, 1996).