Abstract

Automated essay scoring (AES) is gaining increasing attention in the education sector, as it significantly reduces the burden of manual scoring and allows ad hoc feedback for learners. Natural language processing based on machine learning has been shown to be particularly suitable for text classification and AES. While many machine-learning approaches for AES still rely on a bag-of-words (BOW) approach, we consider a transformer-based approach in this paper, compare its performance to a logistic regression model based on the BOW approach, and discuss their differences. The analysis is based on 2088 email responses to a problem-solving task that were manually labeled in terms of politeness. Both transformer models considered in the analysis outperformed the regression-based model without any hyperparameter tuning. We argue that, for AES tasks such as politeness classification, the transformer-based approach has significant advantages, while a BOW approach suffers from not taking word order into account and from reducing words to their stems. Further, we show how such models can help increase the accuracy of human raters, and we provide detailed instructions on how to implement transformer-based models for one's own purposes.
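
As a concrete illustration of the baseline mentioned above, the following minimal sketch fits a bag-of-words logistic regression classifier for politeness with scikit-learn. The email texts, labels, and settings are invented placeholders rather than the study's data or exact pipeline; the sketch only shows the general pattern such a BOW baseline would follow.

# Minimal, hypothetical BOW baseline: unigram counts fed into logistic regression.
# The two example emails and their labels are placeholders, not the study data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

emails = [
    "Dear team, could you please look into the attached problem?",  # polite
    "Send me the solution immediately.",                            # impolite
]
labels = [1, 0]  # 1 = polite, 0 = impolite

# CountVectorizer builds the bag-of-words representation: word order is
# discarded, which is exactly the limitation discussed in the abstract.
bow_classifier = make_pipeline(
    CountVectorizer(lowercase=True),
    LogisticRegression(max_iter=1000),
)
bow_classifier.fit(emails, labels)

print(bow_classifier.predict(["Could you kindly check this, please?"]))
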

Highlights

  • Recent developments in natural language processing (NLP) and progress in machine learning (ML) algorithms have opened the door to new approaches within the educational sector in general, and the measurement of student performance in particular.

  • A model combining a long short-term memory (LSTM) network with a convolutional neural network (CNN) significantly outperforms the two baseline models based on support vector regression and Bayesian linear ridge regression, while also outperforming models based on just the LSTM, the CNN, or gated recurrent units (GRUs).

  • They show that transformer-based approaches yield results comparable to those of a combined LSTM and CNN model.

Introduction

Recent developments in natural language processing (NLP) and progress in machine learning (ML) algorithms have opened the door to new approaches within the educational sector in general, and the measurement of student performance in particular. Intelligent tutoring systems, plagiarism-detection software, and helpful chatbots are just a few examples of how ML is currently used to support learners and teachers [1]. An important part of providing personalized feedback and supporting students is automated essay scoring (AES), in which algorithms are implemented to classify long text answers in accordance with classifications by human raters [2]. We implement AES using current state-of-the-art language models based on neural networks with a transformer architecture [3,4], and we want to explore two main questions with this approach.
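
To make the transformer-based setup more tangible, the sketch below fine-tunes a pretrained transformer for binary politeness classification following the standard Hugging Face transformers pattern (AutoTokenizer, AutoModelForSequenceClassification, Trainer). The model checkpoint, toy data, and training settings are assumptions for illustration and do not reproduce the authors' exact configuration.

# Hypothetical fine-tuning sketch; "bert-base-uncased" and the toy data are
# placeholders, not the configuration used in the paper.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

texts = ["Dear Sir or Madam, could you please help me with this task?",
         "Fix this now."]
labels = [1, 0]  # 1 = polite, 0 = impolite

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                           num_labels=2)

class PolitenessDataset(torch.utils.data.Dataset):
    """Wraps tokenized emails and labels so the Trainer can iterate over them."""
    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

train_dataset = PolitenessDataset(texts, labels)

training_args = TrainingArguments(output_dir="politeness-model",
                                  num_train_epochs=3,
                                  per_device_train_batch_size=8)
trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()

Unlike the BOW baseline sketched earlier, the tokenizer here preserves word order and subword information, which is the advantage of the transformer-based approach highlighted in the abstract.
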
