Exploring the Role of Artificial Intelligence in Facilitating Assessment of Writing Performance in Second Language Learning

Zilu Jiang, Kui Xie, Zexin Xu, Jingwen He, Zilong Pan

doi:10.3390/languages8040247

Open Access

Abstract

This study examined the robustness and efficiency of four large language models (LLMs), GPT-4, GPT-3.5, iFLYTEK, and Baidu Cloud, in assessing writing accuracy in Chinese. Writing samples were collected from students in an online high school Chinese language learning program in the US. The LLMs’ official APIs were used to analyze the samples at both the T-unit and sentence levels. Performance metrics were computed for each LLM, and the LLM results were compared with human ratings. Content analysis was conducted to categorize error types and highlight discrepancies between human and LLM ratings. The efficiency of each model was also evaluated. The results indicate that the GPT models and iFLYTEK achieved similar accuracy scores, with GPT-4 excelling in precision. These findings provide insights into the potential of LLMs to support the assessment of writing accuracy for language learners.

Full Text