Abstract

The attention mechanism in Neural Machine Translation (NMT) models added flexibility to translation systems and made it possible to visualize soft-alignments between source and target representations. While there is much debate about the relationship between attention and the output of neural models (Jain and Wallace 2019; Serrano and Smith 2019; Wiegreffe and Pinter 2019; Vashishth et al. 2019), in this paper we propose a different assessment, investigating the interpretability of soft-alignments in low-resource scenarios. We experimented with different architectures (RNN (Bahdanau et al. 2015), 2D-CNN (Elbayad et al. 2018), and Transformer (Vaswani et al. 2017)), comparing them with regard to their ability to produce directly exploitable alignments. To evaluate exploitability, we replicated the Unsupervised Word Segmentation (UWS) task from Godard et al. (2018), in which source words are translated into unsegmented phone sequences. After training, the resulting soft-alignments are used to produce a segmentation of the target side. Our results showed that an RNN-based NMT model produced the most exploitable alignments in this scenario. We then investigated methods for increasing its UWS scores, comparing the following approaches: monolingual pre-training, input representation augmentation (hybrid model), and explicit word-length optimization during training. We reached the best results with the hybrid model, which enriches the input representation before training with an intermediate segmentation obtained monolingually from a non-parametric Bayesian model (Goldwater 2007).
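To make the alignment-to-segmentation step described above concrete, the following is a minimal, illustrative sketch (not the authors' code): it assigns each target phone to its most-attended source word and inserts a word boundary whenever the aligned word changes. The function name, phone symbols, and attention matrix are hypothetical, chosen only for the example.

```python
import numpy as np

def segment_from_attention(phones, attention):
    """phones: list of phone symbols (length T).
    attention: (T x S) matrix of soft-alignment weights between
    target phones and source words."""
    # Align each phone to its most-attended source word.
    aligned_word = attention.argmax(axis=1)
    segments, current = [], [phones[0]]
    for t in range(1, len(phones)):
        # Insert a word boundary whenever the aligned source word changes.
        if aligned_word[t] != aligned_word[t - 1]:
            segments.append("".join(current))
            current = []
        current.append(phones[t])
    segments.append("".join(current))
    return segments

# Hypothetical example: five phones attending over two source words.
phones = ["m", "b", "o", "s", "i"]
attention = np.array([[0.9, 0.1],
                      [0.8, 0.2],
                      [0.7, 0.3],
                      [0.2, 0.8],
                      [0.1, 0.9]])
print(segment_from_attention(phones, attention))  # ['mbo', 'si']
```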
