Abstract
We propose Seq2Edits, an open-vocabulary approach to sequence editing for natural language processing (NLP) tasks with a high degree of overlap between input and output texts. In this approach, each sequence-to-sequence transduction is represented as a sequence of edit operations, where each operation either replaces an entire source span with target tokens or keeps it unchanged. We evaluate our method on five NLP tasks (text normalization, sentence fusion, sentence splitting & rephrasing, text simplification, and grammatical error correction) and report competitive results across the board. For grammatical error correction, our method speeds up inference by up to 5.2x compared to full sequence models because inference time depends on the number of edits rather than the number of target tokens. For text normalization, sentence fusion, and grammatical error correction, our approach improves explainability by associating each edit operation with a human-readable tag.
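To make the span-edit representation concrete, here is a minimal sketch in Python. The function name apply_edits, the SELF marker, and the example sentence and tags are our own illustrative choices, not the paper's code: each operation carries a human-readable tag and either copies a contiguous source span or replaces it with target tokens.

```python
# Illustrative sketch (not the paper's implementation): a Seq2Edits-style
# prediction is a sequence of (tag, span_end, replacement) operations over
# contiguous source spans. SELF means "keep the source span unchanged".

from typing import List, Optional, Tuple

SELF = None  # marker for "copy the source span verbatim"

def apply_edits(source: List[str],
                edits: List[Tuple[str, int, Optional[List[str]]]]) -> List[str]:
    """Apply span-based edits to a tokenized source sentence.

    Each edit is (tag, span_end, replacement):
      - tag:         human-readable label such as "SELF" or "VERB:SVA"
      - span_end:    end position (exclusive) of the source span; spans are
                     contiguous, so the start is the previous edit's end
      - replacement: target tokens, or SELF to copy the span unchanged
    """
    output, start = [], 0
    for tag, span_end, replacement in edits:
        span = source[start:span_end]
        output.extend(span if replacement is SELF else replacement)
        start = span_end
    return output

# Hypothetical grammatical error correction example:
source = "He have a dogs .".split()
edits = [
    ("SELF",     1, SELF),      # keep "He"
    ("VERB:SVA", 2, ["has"]),   # replace "have" -> "has"
    ("SELF",     3, SELF),      # keep "a"
    ("NOUN:NUM", 4, ["dog"]),   # replace "dogs" -> "dog"
    ("SELF",     5, SELF),      # keep "."
]
print(" ".join(apply_edits(source, edits)))  # He has a dog .
```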
Highlights
Neural models that generate a target sequence conditioned on a source sequence were initially proposed for machine translation (MT) (Sutskever et al., 2014; Kalchbrenner and Blunsom, 2013; Bahdanau et al., 2015; Vaswani et al., 2017), but are now widely used as a central component of a variety of natural language processing (NLP) systems (e.g., Tan et al. (2017); Chollampatt and Ng (2018))
The number of iterations in task-specific training is set empirically based on the performance on the development set
We have presented a neural model that represents sequence transduction using span-based edit operations
Summary
Neural models that generate a target sequence conditioned on a source sequence were initially proposed for machine translation (MT) (Sutskever et al., 2014; Kalchbrenner and Blunsom, 2013; Bahdanau et al., 2015; Vaswani et al., 2017), but are now widely used as a central component of a variety of NLP systems (e.g., Tan et al. (2017); Chollampatt and Ng (2018)). Raffel et al. (2019) argue that even problems that are traditionally not viewed from a sequence transduction perspective can benefit from massive pre-training when framed as a text-to-text problem. For many NLP tasks, such as correcting grammatical errors in a sentence, the input and output sequences may overlap significantly. Employing a full sequence model in these cases is often wasteful, as most tokens are simply copied over from the input to the output. Another disadvantage of a full sequence model is that it does not provide an explanation for why it proposes a particular target sequence. We apply our edit-operation-based model to five NLP tasks: text normalization, sentence fusion, sentence splitting & rephrasing, text simplification, and grammatical error correction (GEC).
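As a rough, purely illustrative comparison (the sentences, tags, and helper names below are hypothetical, not results from the paper), the following sketch counts autoregressive decoding steps for a full sequence model versus an edit-based model on a sentence with a single correction, showing why inference time tracks the number of edits rather than the target length.

```python
# Rough, illustrative comparison: a full-sequence decoder emits one
# autoregressive step per target token, while an edit-based decoder emits one
# step per edit operation, so lightly edited sentences need far fewer steps.

from typing import List, Optional, Tuple

Edit = Tuple[str, int, Optional[List[str]]]  # (tag, span_end, replacement or None)

def full_sequence_steps(target: List[str]) -> int:
    """One decoding step per target token."""
    return len(target)

def edit_sequence_steps(edits: List[Edit]) -> int:
    """One decoding step per edit operation."""
    return len(edits)

source = "In conclusion , I thinks that everyone has the right to live freely .".split()
target = "In conclusion , I think that everyone has the right to live freely .".split()

# Hypothetical edit sequence: keep the first four tokens, fix "thinks" -> "think",
# then keep the rest of the sentence as a single span.
edits: List[Edit] = [
    ("SELF", 4, None),
    ("VERB:SVA", 5, ["think"]),
    ("SELF", len(source), None),
]

print(full_sequence_steps(target))  # 14
print(edit_sequence_steps(edits))   # 3
```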