Improving Word Sense Disambiguation in Neural Machine Translation with Sense Embeddings

Annette Rios Gonzales, Laura Mascarell, Rico Sennrich

doi:10.18653/v1/w17-4702

Abstract

Word sense disambiguation is necessary in translation because different word senses often have different translations. Neural machine translation models learn different senses of words as part of an end-to-end translation task, and their capability to perform word sense disambiguation has so far not been quantified. We exploit the fact that neural translation models can score arbitrary translations to design a novel cross-lingual word sense disambiguation task that is tailored towards evaluating neural machine translation models. We present a test set of 7,200 lexical ambiguities for German → English, and 6,700 for German → French, and report baseline results. With 70% of lexical ambiguities correctly disambiguated, we find that word sense disambiguation remains a challenging problem for neural machine translation, especially for rare word senses. To improve word sense disambiguation in neural machine translation, we experiment with two methods to integrate sense embeddings. In a first approach we pass sense embeddings as additional input to the neural machine translation system. For the second experiment, we extract lexical chains based on sense embeddings from the document and integrate this information into the NMT model. While a baseline NMT system disambiguates frequent word senses quite reliably, the annotation with both sense labels and lexical chains improves the neural models’ performance on rare word senses.

Highlights

Ambiguous words present a special challenge to machine translation systems: in order to produce a correct sentence in the target language, the system has to decide which meaning is accurate in the given context.
We present an evaluation with two basic neural MT systems, trained with Nematus (Sennrich et al., 2017), using byte pair encoding (BPE) on both the source and target side (Sennrich et al., 2016b).
This paper introduces a novel lexical decision task for the evaluation of neural machine translation (NMT) models, and presents test sets for German-English and German-French.

Summary

Introduction

Ambiguous words present a special challenge to machine translation systems: in order to produce a correct sentence in the target language, the system has to decide which meaning is accurate in the given context. Errors in lexical choice can lead to wrong or even incomprehensible translations. Several ways of evaluating lexical choice for machine translation have been proposed in previous work. Vickrey et al. (2005) evaluate lexical choice in a blank-filling task, where the translation of an ambiguous source word is blanked from the reference translation, and an MT system is tested on whether it can predict the blanked word. In such tasks, a word-level translation (or set of translations) is defined as the gold label. We propose a more constrained task where an MT system has to select one out of a predefined set of translations.
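The constrained task described above can be illustrated with a minimal sketch: a model assigns a score to each translation in a predefined candidate set, and the item counts as correct when the gold translation receives the highest score. Everything in the block is an assumption made for illustration: the function names `pick_translation` and `lexical_choice_accuracy`, the `(source, candidates, gold_index)` item format, and the `toy_score` stand-in, which uses hand-written context cues in place of a real NMT model's translation score. It is not the authors' evaluation code.

```python
from typing import Callable, Iterable, Sequence, Tuple

# Hypothetical interface: a scorer maps (source, translation) to a number,
# where a higher number means the model prefers that translation.
Scorer = Callable[[str, str], float]
Item = Tuple[str, Sequence[str], int]  # (source, candidates, gold_index)


def pick_translation(score: Scorer, source: str, candidates: Sequence[str]) -> int:
    """Return the index of the candidate translation with the highest score."""
    scores = [score(source, candidate) for candidate in candidates]
    # On ties, max() returns the first index with the maximal score.
    return max(range(len(candidates)), key=scores.__getitem__)


def lexical_choice_accuracy(score: Scorer, test_set: Iterable[Item]) -> float:
    """Fraction of items for which the gold translation is scored highest."""
    correct = 0
    total = 0
    for source, candidates, gold_index in test_set:
        total += 1
        if pick_translation(score, source, candidates) == gold_index:
            correct += 1
    return correct / total if total else 0.0


def toy_score(source: str, translation: str) -> float:
    """Toy stand-in for a model score (assumption, not an NMT model).

    Rewards a translation when a hand-picked source context word co-occurs
    with the matching sense of German "Schloss" (castle vs. lock).
    """
    cues = {"König": "castle", "Tür": "lock"}
    source_words = source.split()
    target_words = translation.split()
    return sum(
        1.0
        for context_word, sense_word in cues.items()
        if context_word in source_words and sense_word in target_words
    )


# Illustrative items only; they are not taken from the paper's test set.
examples = [
    ("Der König wohnt im Schloss",
     ["The king lives in the castle", "The king lives in the lock"], 0),
    ("Das Schloss an der Tür ist kaputt",
     ["The castle on the door is broken", "The lock on the door is broken"], 1),
    ("Er öffnete das Schloss",
     ["He opened the castle", "He opened the lock"], 1),
]

accuracy = lexical_choice_accuracy(toy_score, examples)
```

In this toy setup the third item has no context cue, so both candidates tie and the first one is picked. With a real system, `toy_score` would be replaced by the model's score for the full target sentence given the source.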

Objectives

Results

Conclusion