Context-aware positional representation for self-attention networks

Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita

doi:10.1016/j.neucom.2021.04.055

Abstract

In self-attention networks (SANs), positional embeddings model order dependencies between the words of an input sentence and are added to the word embeddings to form the input representation. This representation lets a SAN-based neural model run multi-head self-attention in parallel and stack it over multiple layers to learn a representation of the input sentence. However, it captures only static order dependencies derived from the discrete position indexes of words; that is, it is independent of context information, which may make it weak at modeling the input sentence. To address this issue, we propose a novel positional representation method that models order dependencies based on the n-gram context or the sentence context of the input sentence, allowing SANs to learn a more effective sentence representation. To validate the method, we apply it to neural machine translation, which adopts a typical SAN-based neural model. Experimental results on two widely used translation tasks, WMT14 English-to-German and WMT17 Chinese-to-English, show that the proposed approach significantly improves translation performance over a strong Transformer baseline.
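
The static input representation described in the abstract can be illustrated with a minimal sketch. It assumes the fixed sinusoidal encoding of the standard Transformer as the positional embedding; the function names and toy embedding values are hypothetical, and the sketch shows only the baseline behavior, not the proposed context-aware method.

```python
import math

def sinusoidal_position(pos, d_model):
    """Fixed sinusoidal encoding for one position index (standard Transformer style)."""
    vec = []
    for i in range(d_model):
        # Dimensions 2k and 2k+1 share one frequency: sin on even, cos on odd.
        angle = pos / (10000 ** (2 * (i // 2) / d_model))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec

def input_representation(word_vectors):
    """Add the position vector for index j to the j-th word embedding."""
    d_model = len(word_vectors[0])
    return [
        [w + p for w, p in zip(word_vec, sinusoidal_position(j, d_model))]
        for j, word_vec in enumerate(word_vectors)
    ]

# Toy 4-dimensional word embeddings (made-up values for illustration).
sentence = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
representation = input_representation(sentence)

# The positional term takes only the index as input, never the words,
# so every sentence receives the same vector at a given position.
print(sinusoidal_position(0, 4))
print(representation[0])
```

Because `sinusoidal_position` receives nothing but the index, swapping the words of the sentence changes the word term of the sum and leaves the positional term untouched. This is the context independence the abstract identifies as a weakness.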

Journal: Neurocomputing
Publication Date: Apr 21, 2021
Citations: 4

Similar Papers

Recurrent Positional Embedding for Neural Machine Translation
Kehai Chen ... Rui Wang
01 Jan 2019

Towards More Diverse Input Representation for Neural Machine Translation
Kehai Chen ... Eiichiro Sumita
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 28
01 Jan 2020

Self-Attention with Cross-Lingual Position Representation
Liang Ding ... Dacheng Tao
01 Jan 2020

Naive Regularizers for Low-Resource Neural Machine Translation
Meriem Beloucif ... Marcel Bollmann
22 Oct 2019
