Abstract

The task of automated essay scoring (AES) continues to attract interdisciplinary attention due to its commercial and educational importance as well as related research challenges. Traditional AES approaches rely on handcrafted features, which are time-consuming and labor-intensive to build. Neural network approaches have recently achieved strong results in AES without feature engineering, but they usually require extensive annotated data. Moreover, most existing AES models report only a single holistic score without providing diagnostic information about the various dimensions of writing quality. To address these issues, we develop a novel approach using multi-task learning (MTL) with a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model for multi-dimensional AES tasks. As a state-of-the-art pre-trained language model, BERT can improve AES performance with limited training data. To handle long texts, we propose a hierarchical method that uses an attention mechanism to automatically determine the contribution of different fractions of the input essay to the final score. For the multi-topic essay scoring tasks on the ASAP dataset, results reveal that our approach improves the average quadratic weighted Kappa (QWK) score by 4.5% over a strong baseline. We also introduce a self-collected dataset of Chinese EFL Learners' Argumentation (CELA), which provides information about writing quality on multiple rating dimensions: a holistic scale and five analytic scales. For the multi-rating-dimension essay scoring tasks on the CELA dataset, experimental results demonstrate that our model increases the average QWK score by 8.1% compared with the strong baseline.
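
To make the described architecture concrete, the following is a minimal, hypothetical PyTorch sketch, not the authors' released code: a long essay is split into BERT-sized segments, each segment is encoded with BERT, a learned attention layer weighs the segment vectors, and one linear head per rating dimension produces the multi-task score outputs. All class, function, and parameter names here are our own assumptions.

    # Hypothetical sketch: segment-level BERT encoding, attention pooling
    # over segments, and one regression head per rating dimension (MTL).
    import torch
    import torch.nn as nn
    from transformers import BertModel

    class HierarchicalMultiTaskScorer(nn.Module):
        def __init__(self, num_dims=6, hidden=768):
            super().__init__()
            self.bert = BertModel.from_pretrained("bert-base-uncased")
            # Attention layer scores each segment's contribution to the essay.
            self.attn = nn.Linear(hidden, 1)
            # One head per rating dimension (e.g., holistic + five analytic).
            self.heads = nn.ModuleList(
                [nn.Linear(hidden, 1) for _ in range(num_dims)]
            )

        def forward(self, input_ids, attention_mask):
            # input_ids: (num_segments, seq_len) -- one long essay already
            # split into BERT-sized segments; batch dimension omitted.
            out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
            seg = out.last_hidden_state[:, 0, :]      # [CLS] vector per segment
            w = torch.softmax(self.attn(seg), dim=0)  # segment weights sum to 1
            essay = (w * seg).sum(dim=0)              # attention-pooled essay vector
            # One regression output per rating dimension.
            return torch.cat([head(essay) for head in self.heads])
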

Highlights

  • The task of automated essay scoring (AES) draws interdisciplinary interest in linguistics [1], [2], education [3]–[5] and natural language processing (NLP) [6]–[8]

  • We develop a novel method using multi-task learning with a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model for multi-dimensional essay scoring tasks

  • The motivation of our study is to develop a novel approach using multi-task learning with a fine-tuned BERT model for multi-topic scoring tasks on the Automated Student Assessment Prize (ASAP) dataset and multi-rating-dimension tasks on the self-collected Chinese EFL Learners’ Argumentation (CELA) dataset



Introduction

The task of automated essay scoring (AES) draws interdisciplinary interest in linguistics [1], [2], education [3]–[5] and natural language processing (NLP) [6]–[8]. AES approaches fall into two subtypes: traditional models built on handcrafted features and neural network models. The disadvantages of the first subtype are that features must be manually chosen to fit the model and that extra effort is required to perform effectively on various tasks. Attempts to resolve this dilemma have led to the development of the neural network approach. As a state-of-the-art pre-trained language model [14], Bidirectional Encoder Representations from Transformers (BERT) [15] is based on a multi-layer bidirectional transformer and has achieved strong results in a variety of language-based tasks. However, little research has utilized the pre-trained language model BERT for AES tasks.
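
As a concrete picture of what fine-tuning BERT for essay scoring involves, the sketch below shows a single regression training step using the Hugging Face transformers library. It is a generic illustration under our own assumptions; the toy essays, score scaling, and hyperparameters are not the paper's setup.

    # Generic illustration of fine-tuning BERT for essay score regression.
    # Inputs, labels, and hyperparameters are illustrative assumptions.
    import torch
    from transformers import BertTokenizerFast, BertForSequenceClassification

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=1  # num_labels=1 -> regression (MSE loss)
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    essays = ["The author argues that ...", "In my opinion ..."]  # toy inputs
    scores = torch.tensor([[0.7], [0.4]])  # gold scores scaled to [0, 1]

    batch = tokenizer(essays, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    model.train()
    loss = model(**batch, labels=scores).loss  # mean-squared error for regression
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
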

