Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed

Thuong Hai Pham, Dominik Macháček, Ondřej Bojar

doi:10.13053/cys-23-3-3265

Abstract

The utility of linguistic annotation in neural machine translation seemed to have been established in past papers. The experiments were, however, limited to recurrent sequence-to-sequence architectures and relatively small data settings. We focus on the state-of-the-art Transformer model and use comparatively larger corpora. Specifically, we try to promote the knowledge of source-side syntax using multi-task learning, either through simple data manipulation techniques or through a dedicated model component. In particular, we train one of the Transformer attention heads to produce the source-side dependency tree. Overall, our results cast some doubt on the utility of multi-task setups with linguistic information. The data manipulation techniques recommended in previous works prove ineffective in large data settings. The treatment of self-attention as dependencies seems much more promising: it helps in translation and reveals that the Transformer model can very easily grasp the syntactic structure. An important but curious result is, however, that identical gains are obtained by using trivial "linear trees" instead of true dependencies. The gain may thus come not from the added linguistic knowledge but from some simpler regularizing effect we induced on the self-attention matrices.
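The idea of treating one self-attention head as a dependency predictor can be illustrated with a minimal sketch. The abstract does not state the loss function or how a "linear tree" is defined, so everything below is an assumption made for illustration: the function names are hypothetical, the auxiliary loss is taken to be a cross-entropy between each token's attention distribution and its gold head position, and a linear tree is taken to mean that every token depends on the token before it.

```python
import math


def linear_tree_heads(n):
    """Assumed 'linear tree': token i depends on token i - 1.

    Token 0 points at itself as a stand-in for the root (an assumption,
    not a convention taken from the paper).
    """
    return [max(i - 1, 0) for i in range(n)]


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def head_supervision_loss(scores, heads):
    """Mean cross-entropy between one head's attention rows and gold heads.

    scores[i][j] is the unnormalized attention score from token i to token j;
    heads[i] is the index of the token that token i should attend to.
    """
    loss = 0.0
    for i, row in enumerate(scores):
        probs = softmax(row)
        loss -= math.log(probs[heads[i]])
    return loss / len(scores)


if __name__ == "__main__":
    n = 4
    heads = linear_tree_heads(n)
    uniform = [[0.0] * n for _ in range(n)]
    peaked = [[10.0 if j == heads[i] else 0.0 for j in range(n)] for i in range(n)]
    print(heads)
    print(head_supervision_loss(uniform, heads))
    print(head_supervision_loss(peaked, heads))
```

In a multi-task setup of this kind, such a term would be added to the translation loss with some weight. Swapping `linear_tree_heads` for parser-produced head indices changes only the supervision target, which is what makes the comparison between true dependencies and linear trees straightforward.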

Journal: Computación y Sistemas	Publication Date: Oct 7, 2019
Citations: 48

