Abstract

We propose the Recursive Non-autoregressive Graph-to-Graph Transformer architecture (RNGTr) for the iterative refinement of arbitrary graphs through the recursive application of a non-autoregressive Graph-to-Graph Transformer, and we apply it to syntactic dependency parsing. We demonstrate the power and effectiveness of RNGTr on several dependency corpora, using a refinement model pre-trained with BERT. We also introduce Syntactic Transformer (SynTr), a non-recursive parser similar to our refinement model. RNGTr can improve the accuracy of a variety of initial parsers on 13 languages from the Universal Dependencies Treebanks, the English and Chinese Penn Treebanks, and the German CoNLL 2009 corpus, even improving over the new state-of-the-art results achieved by SynTr, significantly improving the state of the art for all corpora tested.
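
The refinement loop described in the abstract amounts to re-predicting the full dependency graph, conditioned on the previous prediction, until it stops changing or an iteration budget is exhausted. A minimal sketch of that control flow, assuming hypothetical initial_parser and refinement_model callables (these names are placeholders, not the paper's code):

    # Minimal sketch of RNGTr-style recursive refinement.
    # `initial_parser` and `refinement_model` are hypothetical callables, not the paper's API.
    def refine_parse(sentence, initial_parser, refinement_model, max_iters=3):
        graph = initial_parser(sentence)                   # any initial parse (may even be empty)
        for _ in range(max_iters):
            new_graph = refinement_model(sentence, graph)  # non-autoregressive: predict all edges at once
            if new_graph == graph:                         # fixed point reached, no further corrections
                break
            graph = new_graph
        return graph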

Highlights

  • Self-attention models, such as the Transformer (Vaswani et al., 2017), have been hugely successful in a wide range of natural language processing (NLP) tasks, especially when combined with language-model pre-training, such as BERT (Devlin et al., 2019).

  • After some initial experiments to determine the maximum number of refinement iterations, we report the performance of the RNG Transformer model on the Universal Dependencies (UD) Treebanks, the Penn Treebanks, and the German CoNLL 2009 Treebank.

  • The Syntactic Transformer (SynTr) model significantly outperforms the UDify model, so its remaining errors are harder to correct by adding the RNGTr model (a relative error reduction in LAS after integration of 2.67% for SynTr versus 15.01% for UDify; see the worked example after this list).
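
The relative error reduction quoted in the last highlight measures how much of the baseline parser's remaining LAS error the RNGTr integration removes; a stronger baseline such as SynTr leaves less error to correct, so the same refinement yields a smaller relative reduction. A small worked sketch (the function name and example numbers are ours, not the paper's):

    # Relative error reduction in LAS: fraction of the remaining error removed.
    def relative_error_reduction(las_before, las_after):
        return 100.0 * (las_after - las_before) / (100.0 - las_before)

    # Hypothetical illustration (numbers invented): a baseline at 90.0 LAS improved
    # to 91.5 LAS removes 15% of its remaining error.
    print(relative_error_reduction(90.0, 91.5))  # 15.0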

Summary

Introduction

Self-attention models, such as the Transformer (Vaswani et al., 2017), have been hugely successful in a wide range of natural language processing (NLP) tasks, especially when combined with language-model pre-training, such as BERT (Devlin et al., 2019). These architectures contain a stack of self-attention layers that can capture long-range dependencies over the input sequence, while still representing its sequential order using absolute position encodings. The Graph-to-Graph Transformer parser of Mohammadshahi and Henderson (2020), on which this work builds, predicts one edge of the parse graph at a time, conditioning on the graph of previous edges, so it is an autoregressive model.
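
To make the autoregressive/non-autoregressive contrast concrete: an autoregressive parser decodes one edge per step and conditions each decision on the edges chosen so far, while a non-autoregressive model predicts every word's head in a single pass. A schematic sketch under assumed scorer interfaces (score_next_edge and score_all_heads are hypothetical placeholders, not the paper's code):

    # Schematic contrast between decoding regimes (hypothetical scorer interfaces).
    def autoregressive_parse(words, score_next_edge):
        edges = set()
        for _ in range(len(words)):                    # one edge decision per word
            edges.add(score_next_edge(words, edges))   # conditions on previously predicted edges
        return edges

    def non_autoregressive_parse(words, score_all_heads):
        heads = score_all_heads(words)                 # one parallel pass: head index for every word
        return {(head, dep) for dep, head in enumerate(heads)}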
