Abstract

Most neural machine translation (NMT) models rely only on parallel sentence pairs, and their performance drops sharply in low-resource settings because the models fail to mine the linguistic knowledge latent in the corpus. Explicitly incorporating prior monolingual knowledge, such as syntax, has been shown to be effective for NMT, particularly in low-resource scenarios. However, existing approaches have not exploited the full potential of NMT architectures. In this paper, we present syntax-graph guided self-attention (SGSA): a neural network model that combines source-side syntactic knowledge with multi-head self-attention. We introduce an additional syntax-aware localness modeling as a bias, which indicates the syntactically relevant positions that should receive more attention. The bias is then incorporated into the original attention distribution to form a revised distribution. Moreover, to preserve the model's ability to capture meaningful semantic representations of the source sentence, we adopt a node random dropping strategy in the multi-head self-attention subnetworks. Extensive experiments on several standard small-scale datasets demonstrate that SGSA significantly improves the performance of Transformer-based NMT and also outperforms the previous syntax-dependent state of the art.
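
To make the mechanism described above concrete, the following is a minimal sketch (not the authors' implementation) of how a syntax-derived bias could be folded into scaled dot-product self-attention before the softmax, together with an illustrative node random dropping step. The tensor `syntax_bias` (built from the source syntax graph), the argument `node_drop_prob`, and the exact dropping scheme are assumptions for illustration only; the paper's precise formulation may differ.

```python
import torch
import torch.nn.functional as F

def syntax_biased_attention(q, k, v, syntax_bias, node_drop_prob=0.1, training=True):
    """Scaled dot-product attention with an additive syntax-aware bias.

    q, k, v:      [batch, heads, seq_len, d_k]
    syntax_bias:  [batch, heads, seq_len, seq_len]; larger values for
                  syntactically relevant (e.g. graph-adjacent) positions.
    """
    d_k = q.size(-1)
    # Original attention scores.
    scores = torch.matmul(q, k.transpose(-2, -1)) / d_k ** 0.5

    if training and node_drop_prob > 0:
        # Illustrative node random dropping: zero the syntax bias for a random
        # subset of source positions so the heads do not over-rely on it.
        keep = (torch.rand(q.size(0), 1, 1, q.size(2), device=q.device)
                > node_drop_prob)
        syntax_bias = syntax_bias.masked_fill(~keep, 0.0)

    # Revised attention distribution: original scores plus the syntax bias.
    attn = F.softmax(scores + syntax_bias, dim=-1)
    return torch.matmul(attn, v)
```

In this sketch the bias is simply added to the attention logits, so positions favored by the syntax graph receive proportionally more probability mass after the softmax, while the random dropping keeps part of each batch's attention purely content-driven.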
