Abstract

Automated Essay Scoring (AES) is a critical text regression task that automatically assigns scores to essays based on their writing quality. Recently, the performance of sentence prediction tasks has been largely improved by pre-trained language models, via fusing representations from different layers, constructing an auxiliary sentence, using multi-task learning, etc. However, to solve the AES task, previous works utilize shallow neural networks to learn essay representations and constrain the calculated scores with either a regression loss or a ranking loss. Shallow neural networks trained on limited samples struggle to capture the deep semantics of texts, and without an accurate scoring function, ranking loss and regression loss measure two different aspects of the calculated scores. To improve AES performance, we find a new way to fine-tune pre-trained language models with multiple losses of the same task. In this paper, we propose to first utilize a pre-trained language model to learn text representations. With scores calculated from the representations, a mean square error loss and a batch-wise ListNet loss with dynamic weights constrain the scores simultaneously. We use Quadratic Weighted Kappa to evaluate our model on the Automated Student Assessment Prize dataset. Our model outperforms not only state-of-the-art neural models by nearly 3 percent but also the latest statistical model. Especially on the two narrative prompts, our model performs much better than all the other state-of-the-art models.
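
The sketch below gives a minimal illustration, in PyTorch, of how a mean square error loss and a batch-wise ListNet loss could be combined with a dynamic weight; the linear weighting schedule and which term receives the growing weight are assumptions for illustration, not the exact configuration reported in the paper.

import torch
import torch.nn.functional as F

def listnet_loss(pred_scores, true_scores):
    # Batch-wise ListNet: cross entropy between the top-one probability
    # distributions induced by the gold scores and the predicted scores.
    true_probs = F.softmax(true_scores, dim=0)
    log_pred_probs = F.log_softmax(pred_scores, dim=0)
    return -(true_probs * log_pred_probs).sum()

def combined_loss(pred_scores, true_scores, epoch, total_epochs):
    # pred_scores and true_scores are 1-D tensors for one batch of essays.
    # Dynamic weight: a simple linear schedule over training epochs
    # (assumed here purely for illustration).
    gamma = epoch / max(total_epochs - 1, 1)
    regression = F.mse_loss(pred_scores, true_scores)
    ranking = listnet_loss(pred_scores, true_scores)
    return gamma * regression + (1.0 - gamma) * ranking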

Highlights

  • Automated Essay Scoring (AES) automatically evaluates the writing quality of essays

  • Before introducing our new way to use pre-trained language models, we first briefly review existing work on AES

  • Measured by Quadratic Weighted Kappa (QWK), our model outperforms state-of-the-art neural models by nearly 3 percent on the average QWK score over all eight prompts and performs better than the latest statistical model (a QWK computation sketch follows below)
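
As referenced above, QWK can be computed with scikit-learn's cohen_kappa_score using quadratic weights; the gold and predicted scores below are hypothetical values for illustration only.

from sklearn.metrics import cohen_kappa_score

# Hypothetical integer scores for a handful of essays from one prompt.
gold_scores = [2, 3, 4, 3, 1, 4]
pred_scores = [2, 3, 3, 3, 2, 4]

# Quadratic Weighted Kappa penalizes larger disagreements more heavily.
qwk = cohen_kappa_score(gold_scores, pred_scores, weights="quadratic")
print(f"QWK = {qwk:.3f}")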

Summary

Introduction

Automated Essay Scoring (AES) automatically evaluates the writing quality of essays. Evaluating essay assignments costs a lot of time. Pre-trained language models have shown an extraordinary ability of representation and generalization, and have achieved better performance on many downstream tasks such as text classification and regression. Sun et al. (2019b) summarized several fine-tuning methods, including fusing text representations from different layers, utilizing multi-task learning, etc. Existing works utilize different methods to learn text representations and constrain scores, which are the two key steps in AES models: one is how to learn better essay representations to evaluate writing quality, and the other is how to learn a more accurate score mapping function. We propose a new method called multi-loss to fine-tune BERT models on AES tasks. To show the effectiveness of self-attention in the BERT model, we illustrate the weights of different words on two examples, one argumentative essay and one narrative essay.
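
As a rough illustration of the representation-learning step, the sketch below uses a pre-trained BERT encoder to produce an essay representation that a linear head maps to a score. The model name, the [CLS] pooling choice, the linear head, and the example essay text are assumptions for illustration; in the proposed approach the encoder and head would be fine-tuned jointly with the combined loss described above.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
score_head = torch.nn.Linear(encoder.config.hidden_size, 1)

essay = "Laughter is a key part of any strong relationship ..."  # hypothetical essay text
inputs = tokenizer(essay, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state
essay_repr = hidden[:, 0, :]                    # [CLS] token as the essay representation
score = torch.sigmoid(score_head(essay_repr))   # normalized score in (0, 1)
print(score.item())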

Related Works
R2BERT
Self-attention
Feature Extraction
Regression
Batchwise Learning to Rank Model
Combination of Regression and Ranking
Experiment
Dataset
Experiment Settings
Evaluation Metric
Average
Baselines and Implementation Details
Experiment Results and Analysis
Runtime and Memory
Conclusion and Future Works