Improving the fairness of ECL listening tests by detecting gender-biased items

Highlights

  • For decades, reliability and validity have been the two main test characteristics to which language test developers pay the most attention when designing tests

  • The present study examines to what extent ECL listening test items at Common European Framework of Reference for Languages (CEFR) level B2, administered between February 2018 and December 2019, exhibit differential item functioning across test-taker groups in terms of gender

  • The results of the statistical analysis, performed with the MFRM-based software Facets, showed differential item functioning for 13 items, which corresponds to 6.5 percent of the total number of items
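The highlights report a Many-Facet Rasch Measurement (MFRM) analysis carried out with the Facets software, which is not reproduced here. As a simpler, purely illustrative stand-in, the sketch below computes the classical Mantel-Haenszel common odds ratio for a single item, a widely used alternative technique for flagging gender DIF. All data, variable names, and the function itself are hypothetical and are not taken from the study.

```python
def mh_odds_ratio(item_resp, group, strata):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    item_resp: 1/0 correctness per test-taker
    group:     0 = reference group, 1 = focal group (e.g., the two genders)
    strata:    matching variable per test-taker (typically total test score)

    Values near 1.0 suggest no DIF; marked deviations flag the item
    for further review.
    """
    num = den = 0.0
    for s in set(strata):
        idx = [i for i, v in enumerate(strata) if v == s]
        a = sum(1 for i in idx if group[i] == 0 and item_resp[i] == 1)  # ref correct
        b = sum(1 for i in idx if group[i] == 0 and item_resp[i] == 0)  # ref incorrect
        c = sum(1 for i in idx if group[i] == 1 and item_resp[i] == 1)  # focal correct
        d = sum(1 for i in idx if group[i] == 1 and item_resp[i] == 0)  # focal incorrect
        t = a + b + c + d
        if t:
            num += a * d / t
            den += b * c / t
    return num / den if den else float("nan")


# Balanced toy example: both groups answer the item equally well,
# so the common odds ratio comes out at exactly 1.0 (no DIF signal).
resp = [1] * 30 + [0] * 10 + [1] * 30 + [0] * 10
grp = [0] * 40 + [1] * 40
score = [0] * 80  # a single ability stratum, for brevity
print(round(mh_odds_ratio(resp, grp, score), 2))  # → 1.0
```

Note that this is a deliberately different and much simpler method than the MFRM bias/interaction analysis the study used; it is shown only to make the idea of flagging items by group-conditional performance concrete.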


Introduction

For decades, reliability and validity have been the two main test characteristics to which language test developers pay the most attention when designing their tests (cf. Bachman, 1990: 24). The term test fairness has been appearing with increasing frequency in papers, studies, and presentations on the topic of language assessment (e.g., Kane, 2010; Kremmel, 2019; Kunnan, 2000; 2004; 2007; 2014; Stoynoff, 2012). Professional guidelines such as the Code for Fair Testing Practices in Education (Joint Committee on Testing Practices, 2005: 23), the ETS International Principles for the Fairness in Assessments (ETS, 2016: 3–4) and the ALTE Principles of Good Practice (ALTE, 2020: 13) emphasize that test developers should strive to make their tests as fair as possible for candidates of different gender, age, ethnic origin, cultural and language background, and special handicapping conditions and needs. One of the most effective strategies for achieving this goal is to construct bias-free tests.
