The extensive use of automated speech scoring in large-scale speaking assessment can be revolutionary not only for test design and rating but also for the learning and instruction of speaking, depending on how students and teachers perceive and react to this technology. However, its washback remains underexplored. This mixed-methods study investigated the washback of the TOEFL iBT Speaking section's SpeechRater on Chinese EFL learners through a questionnaire and interviews, and explored its associations with test performance as well as the multi-levelled factors that shape it. The participants experienced a mixture of positive and negative washback, including heightened motivation for individual learning through personal devices, decreased real-life communicative practice, and increased exam-driven behaviours. Test takers' personal understandings of automated speech scoring were found to directly shape the washback of SpeechRater that they experienced. Furthermore, their TOEFL iBT Speaking scores were positively correlated with the implicit washback of SpeechRater on their learning but uncorrelated with its explicit washback on their test preparation. Drawing on these findings, the study proposes a washback model of automated speech scoring and offers suggestions to test designers, teachers and learners on how to strengthen its positive washback and mitigate its negative washback. The research also underscores the importance of test takers' awareness of the integrated dimensions along which spoken English is evaluated in real-life use. Accordingly, instructional implications are discussed regarding how teachers can guide students to use automated speech scoring in their learning and to set comprehensive goals for learning spoken English.