ChatGPT, an innovative large language model that has impressed audiences worldwide with its generative capabilities, is positioned to significantly transform the field of education. This exploratory study investigates how accurately ChatGPT generates feedback on the content and organization components of EFL compare and contrast essays, and the extent to which the length of ChatGPT's feedback differs from that of a human teacher. To address these questions, a ChatGPT prompt incorporating evaluation criteria for the content and organization components was developed and used to generate feedback on 10 student compare and contrast essays with ChatGPT 3.5. The ChatGPT feedback and the teacher feedback were assessed quantitatively and qualitatively according to the predetermined evaluation criteria. The two types of feedback were also compared descriptively and with the Wilcoxon signed-rank test. The findings revealed that ChatGPT produced highly accurate feedback for both the content and organization components and provided longer feedback than the teacher. Although the accuracy rate of the generated feedback was high, several issues were identified: holistic assessment of the essay, false positives, failure to provide feedback where needed, and discrepancies in the depth of feedback compared with teacher feedback. The results suggest that while ChatGPT shows promise in providing educational feedback, teacher-AI collaboration in giving feedback on EFL compare and contrast essays is important for delivering feedback that optimally benefits learners.