Abstract

In Grammatical Error Correction (GEC), sequence labeling models enjoy fast inference compared to sequence-to-sequence models; however, inference in sequence labeling GEC models is an iterative process, as sentences are passed to the model for multiple rounds of correction, which exposes the model to sentences with progressively fewer errors at each round. Traditional GEC models learn from sentences with fixed error rates. Coupling this with the iterative correction process causes a mismatch between training and inference that affects final performance. In order to address this mismatch, we propose a GAN-like sequence labeling model, which consists of a grammatical error detector as a discriminator and a grammatical error labeler with Gumbel-Softmax sampling as a generator. By sampling from real error distributions, the errors we generate are more genuine than traditionally synthesized GEC errors, thus alleviating the aforementioned mismatch and allowing for better training. Our results on several evaluation benchmarks demonstrate that our proposed approach is effective and improves upon the previous state-of-the-art baseline.
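The key mechanism here is sampling discrete error edits in a way that still lets gradients reach the generator. As a rough illustration (not the authors' code; the tensor shapes, temperature, and toy vocabulary below are assumptions for demonstration), PyTorch's built-in Gumbel-Softmax can be used like this:

```python
# Minimal sketch of the generator's Gumbel-Softmax sampling step.
# All shapes and the toy vocabulary are illustrative, not taken from
# the paper's implementation.
import torch
import torch.nn.functional as F

batch, seq_len, vocab = 2, 5, 100
# Logits over candidate (errorful) tokens, as an error labeler might produce.
logits = torch.randn(batch, seq_len, vocab)

# Differentiable "soft" samples: gradients flow back to the logits even
# though each sample approximates a one-hot token choice.
soft_tokens = F.gumbel_softmax(logits, tau=1.0, hard=False)

# Straight-through variant: discrete one-hot choices in the forward pass,
# soft gradients in the backward pass.
hard_tokens = F.gumbel_softmax(logits, tau=1.0, hard=True)

# The sampled (fake) errorful tokens can then be fed to a discriminator,
# keeping the whole pipeline end-to-end trainable.
print(soft_tokens.shape, hard_tokens.argmax(-1))
```

With `hard=True`, the forward pass emits discrete token choices while the backward pass uses the soft relaxation, which is what makes adversarial training over discrete text feasible.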

Highlights

  • Sequence-to-sequence neural solutions (Sutskever et al., 2014; Parnow et al., 2020) have been quite successful in comparison to their statistical counterparts, but these approaches suffer from a couple of key problems, which have given rise to sequence labeling approaches for Grammatical Error Correction (GEC) (Omelianchuk et al., 2020)

  • To combat this exposure bias, we propose a new approach for training a sequence labeling GEC model that draws from GANs (Goodfellow et al., 2014), which consist of a generator that generates increasingly realistic fake inputs and a discriminator tasked with differentiating these fake inputs from real inputs (see the training-loop sketch after this list)

  • The corrective label set is given as T = {$KEEP, $DELETE, $APPEND, $REPLACE} ∪ {$CASE, $MERGE, $SPLIT, $NOUN_NUMBER, $VERB_FORM}, in which the first set consists of the basic text editing transformation operations and the second consists of the g-transformations defined by Omelianchuk et al. (2020) for GEC
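As referenced above, the generator/discriminator setup implies an adversarial training loop roughly like the following; the models, optimizers, and tensor conventions here are placeholders for illustration, not the paper's actual implementation:

```python
# Highly simplified GAN-like training step for a grammatical error
# detector (discriminator) and error labeler (generator). All model
# and data objects are assumed placeholders.
import torch
import torch.nn.functional as F

def train_step(labeler, detector, opt_g, opt_d, clean_tokens, real_errorful):
    # 1) Generator: sample synthetic errors with Gumbel-Softmax so the
    #    discrete edits remain differentiable. Output shape (B, T, V).
    fake_errorful = F.gumbel_softmax(labeler(clean_tokens), tau=1.0, hard=True)

    # 2) Discriminator: distinguish real errorful text (given here as
    #    one-hot tensors of shape (B, T, V)) from sampled text.
    d_real = detector(real_errorful)
    d_fake = detector(fake_errorful.detach())
    loss_d = F.binary_cross_entropy_with_logits(
        d_real, torch.ones_like(d_real)
    ) + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # 3) Generator update: try to fool the detector into labeling the
    #    sampled errors as real.
    d_fake = detector(F.gumbel_softmax(labeler(clean_tokens), tau=1.0, hard=True))
    loss_g = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```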


Summary

Introduction

Sequence-to-sequence neural solutions (Sutskever et al., 2014; Parnow et al., 2020) have been quite successful in comparison to their statistical counterparts, but these approaches suffer from a couple of key problems, which have given rise to sequence labeling approaches for GEC (Omelianchuk et al., 2020). Such approaches task models with generating a list of labels that classify the grammatical errors in a sentence before correcting these errors. The corrective label set is given as T = {$KEEP, $DELETE, $APPEND, $REPLACE} ∪ {$CASE, $MERGE, $SPLIT, $NOUN_NUMBER, $VERB_FORM}, in which the first set consists of the basic text editing transformation operations and the second consists of the g-transformations defined by Omelianchuk et al. (2020) for GEC. Aligning sentences using these transformations in preprocessing reduces what would be a sequence generation task that handles unequal source-target lengths to a set of label classification problems, as illustrated below.
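As a toy illustration of how such token-level labels rewrite a sentence (the example sentence, labels, and helper function are invented for demonstration; real systems like GECToR also handle appends, casing, and the other g-transformations):

```python
# Apply basic corrective labels from the edit set above to a token list.
# The $REPLACE_x / $APPEND_x argument convention is assumed for this demo.
def apply_labels(tokens, labels):
    out = []
    for tok, lab in zip(tokens, labels):
        if lab == "$KEEP":
            out.append(tok)                         # leave token unchanged
        elif lab == "$DELETE":
            continue                                # drop the token
        elif lab.startswith("$APPEND_"):
            out.extend([tok, lab.split("_", 1)[1]]) # keep token, add one after
        elif lab.startswith("$REPLACE_"):
            out.append(lab.split("_", 1)[1])        # swap token for another
    return out

src = ["She", "go", "to", "school", "yesterday"]
labs = ["$KEEP", "$REPLACE_went", "$KEEP", "$KEEP", "$KEEP"]
print(" ".join(apply_labels(src, labs)))  # -> "She went to school yesterday"
```

Because every token receives exactly one label, the model only has to solve per-token classification problems instead of free-form generation.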

