Abstract

Most previous neural machine translation (NMT) systems rely on parallel corpora. Explicitly integrating prior syntactic structure information can further improve neural machine translation. In this article, we propose Syntax Induced Self-Attention (SISA), which models the influence of dependency relations between words through the attention mechanism and refines the attention distribution over a sentence using the resulting dependency weights. We then present a new model, Double Syntax Induced Self-Attention (DSISA), which fuses the features extracted by SISA with those of a compact convolutional neural network (CNN). SISA alleviates long-range dependencies within a sentence, while the CNN captures local context from neighboring tokens. DSISA uses these two different neural networks to extract complementary features for a richer semantic representation and replaces the first layer of the Transformer encoder. It thus exploits not only the global features of tokens in a sentence but also the local features formed by adjacent tokens. Finally, experiments on standard corpora verify the performance of the new model.
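
The abstract does not give the exact formulation of SISA or DSISA, so the following is only a minimal illustrative sketch of the general idea: a self-attention layer whose scores are biased by dependency weights, fused with a small convolutional branch. The class names, the log-bias reweighting, and the concatenation-based fusion are assumptions for clarity, not the authors' implementation.

```python
# Hypothetical sketch of dependency-weighted attention fused with a CNN branch.
# All design details below are assumptions; see the paper for the actual model.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SyntaxInducedSelfAttention(nn.Module):
    """Self-attention whose score matrix is modulated by dependency weights."""

    def __init__(self, d_model):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5

    def forward(self, x, dep_weights):
        # x: (batch, seq_len, d_model); dep_weights: (batch, seq_len, seq_len)
        scores = torch.matmul(self.q(x), self.k(x).transpose(-2, -1)) * self.scale
        # Bias the attention distribution toward syntactically related tokens.
        attn = F.softmax(scores + torch.log(dep_weights + 1e-9), dim=-1)
        return torch.matmul(attn, self.v(x))


class DSISALayer(nn.Module):
    """Fuses global SISA features with local CNN features (first encoder layer)."""

    def __init__(self, d_model, kernel_size=3):
        super().__init__()
        self.sisa = SyntaxInducedSelfAttention(d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2)
        self.fuse = nn.Linear(2 * d_model, d_model)

    def forward(self, x, dep_weights):
        global_feat = self.sisa(x, dep_weights)                     # long-range context
        local_feat = self.conv(x.transpose(1, 2)).transpose(1, 2)   # neighbor context
        return self.fuse(torch.cat([global_feat, local_feat], dim=-1))


# Toy usage with random tensors standing in for token embeddings and
# weights derived from a dependency parse.
x = torch.randn(2, 10, 512)
dep = torch.rand(2, 10, 10)
out = DSISALayer(512)(x, dep)   # shape: (2, 10, 512)
```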
