Abstract
We introduce our system submitted to the restricted track of the BEA 2019 shared task on grammatical error correction (GEC). Selecting an appropriate hypothesis sentence from the candidate list generated by a GEC model is essential. A re-ranker can evaluate the naturalness of a corrected sentence using language models trained on large corpora. However, such language models and language representations do not explicitly take into account the grammatical errors made by learners, so it is not straightforward to use language representations trained on a large corpus, such as Bidirectional Encoder Representations from Transformers (BERT), in a form suited to learners' grammatical errors. We therefore propose fine-tuning BERT on learner corpora containing grammatical errors for re-ranking. Experimental results on the W&I+LOCNESS development set demonstrate that re-ranking with BERT can effectively improve correction performance.
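To make the re-ranking idea concrete, here is a minimal sketch of scoring candidate corrections with BERT. The abstract does not specify the exact scoring function, so this sketch uses pseudo-log-likelihood scoring, one common way to turn a masked language model into a sentence scorer; it also assumes the Hugging Face transformers library and loads the generic pre-trained checkpoint in place of the paper's BERT fine-tuned on learner corpora.

import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForMaskedLM.from_pretrained("bert-base-cased")
model.eval()

def pseudo_log_likelihood(sentence):
    """Mask each token in turn and sum the log-probability BERT
    assigns to the original token at the masked position."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    # Skip [CLS] at position 0 and [SEP] at the last position.
    for i in range(1, ids.size(0) - 1):
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        log_probs = torch.log_softmax(logits, dim=-1)
        total += log_probs[ids[i]].item()
    return total

def rerank(candidates):
    """Return the candidate correction the scorer finds most natural."""
    return max(candidates, key=pseudo_log_likelihood)

print(rerank(["He go to school.", "He goes to school."]))

With a checkpoint fine-tuned on learner corpora, the same scoring loop would reflect the error distributions of learner writing rather than only generic fluency.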
Highlights
Grammatical error correction (GEC) systems may be used for language learning to detect and correct grammatical errors in text written by language learners
We used the First Certificate in English (FCE), Lang-8, NUCLE, and Write & Improve (W&I)+LOCNESS training sets as training data for the transformer and for Bidirectional Encoder Representations from Transformers (BERT), and we split the W&I+LOCNESS development set into development and test data by random selection within each Common European Framework of Reference for Languages (CEFR) level (see the sketch after this list)
By using self-attention-based BERT for re-ranking, which is effective for long-distance information, our system became better at solving long-distance errors; there is still room for improvement
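The per-level split mentioned in the second highlight might look like the following hedged sketch. The split ratio, the random seed, the level labels (A/B/C plus "N" for the native LOCNESS essays), and the in-memory data layout are all illustrative assumptions; the summary only states that the split was random within each CEFR level.

import random

random.seed(0)  # fixed seed so the split is reproducible (seed value assumed)

def split_by_cefr(sentences_by_level, dev_ratio=0.5):
    """Split the W&I+LOCNESS dev set into new development and test portions,
    sampling randomly within each CEFR level so both portions cover all levels."""
    dev, test = [], []
    for level, sentences in sentences_by_level.items():
        shuffled = list(sentences)
        random.shuffle(shuffled)
        cut = int(len(shuffled) * dev_ratio)
        dev.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return dev, test

# Illustrative layout: CEFR level -> sentences from that level.
levels = {
    "A": ["He go to school .", "She have a dog ."],
    "B": ["I am agree with you ."],
    "C": ["The results was surprising ."],
    "N": ["The essay argues convincingly ."],
}
dev_split, test_split = split_by_cefr(levels)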
Summary
Grammatical error correction (GEC) systems may be used for language learning to detect and correct grammatical errors in text written by language learners. GEC has grown in importance over the past few years due to the increasing need for people to learn new languages. GEC was addressed in the Helping Our Own (HOO) (Dale and Kilgarriff, 2011; Dale et al., 2012) and Conference on Natural Language Learning (CoNLL) (Ng et al., 2013, 2014) shared tasks between 2011 and 2014. There are three main types of neural network models for GEC: recurrent neural networks (Ge et al., 2018), multi-layer convolutional models based on convolutional neural networks (Chollampatt and Ng, 2018a), and transformer models based on self-attention (Junczys-Dowmunt et al., 2018). We follow best practices and develop our system based on the transformer model, which has achieved better performance for GEC (Zhao et al., 2019).
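On the candidate-generation side, the following hedged sketch shows how a sequence-to-sequence transformer can produce the n-best list that the re-ranker chooses from, assuming the Hugging Face transformers generate API with beam search. "t5-small" is a generic stand-in, not the paper's model, which is a transformer trained on the learner corpora listed above.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

source = "He go to school every days ."
inputs = tokenizer(source, return_tensors="pt")
# Beam search keeps several hypotheses; returning all of them yields the
# candidate list from which the BERT re-ranker selects the final correction.
outputs = model.generate(
    **inputs,
    num_beams=5,
    num_return_sequences=5,
    max_new_tokens=32,
)
candidates = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
print(candidates)

In a full pipeline, candidates would be passed to a scorer such as the one sketched under the abstract, with the highest-scoring hypothesis returned as the corrected sentence.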