Word sense disambiguation for statistical machine translation

Marine Jacinthe Carpuat

doi:10.14711/thesis-b1029221

Abstract

In this thesis, we show for the first time that lexical semantics modelling is useful in Statistical Machine Translation (SMT). Word Sense Disambiguation (WSD), the task of resolving sense ambiguity to identify the right translation of a word, is one of the major challenges faced by language translation systems. If the English word drug translates into French as either drogue (a narcotic) or médicament (a medicine), then an English-French machine translation system needs to disambiguate every use of drug in order to translate it correctly. Considerable effort has been put into designing and evaluating dedicated WSD models, in particular through the Senseval series of workshops. This effort is partly motivated by the often unstated assumption that any full translation system, to achieve full performance, will sooner or later have to incorporate individual WSD components. However, in most machine translation architectures, in particular SMT, the WSD problem is typically not explicitly addressed. This paradoxical situation has encouraged speculation that recent progress in SMT shows that SMT models are already very good at WSD, and that current WSD systems have nothing to offer state-of-the-art SMT. Going beyond these untested assumptions and speculative claims, we conduct the first direct, extensive empirical study of the strengths and weaknesses of WSD and SMT. Using the state-of-the-art HKUST WSD system, we show that, surprisingly, incorporating WSD predictions in SMT does not help translation quality. Puzzlingly, we also report results suggesting that typical SMT models cannot disambiguate word translations as well as dedicated WSD systems. These seemingly contradictory results lead us to generalize conventional WSD models to incorporate assumptions at least as strong as those in state-of-the-art SMT.
Specifically, (1) WSD targets are generalized from words to phrases, (2) WSD sense inventories and annotation are learned automatically in the same way as conventional SMT translation lexicons, and (3) WSD models are fully integrated in SMT decoding. Remarkably, the resulting generalized Phrase Sense Disambiguation (PSD) models improve translation quality across four different Chinese-to-English translation tasks, as measured by eight common automatic evaluation metrics. Further analysis reveals that generalization from conventional WSD to PSD is necessary in order to obtain consistent improvements in translation quality.
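To make the idea of context-dependent translation choice concrete, here is a minimal toy sketch. It is not the HKUST system or the PSD model from the thesis: the training examples, the naive Bayes classifier, the function names, and the `weight` parameter are all assumptions chosen for illustration. It only shows how a context-sensitive score for each candidate translation could be combined with a static lexicon probability in a log-linear fashion.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (source phrase, context words, aligned translation).
# In a real system these would come from a word-aligned parallel corpus.
TRAINING = [
    ("drug", ["police", "seized", "dealer"], "drogue"),
    ("drug", ["dealer", "arrested", "illegal"], "drogue"),
    ("drug", ["doctor", "prescribed", "patient"], "médicament"),
    ("drug", ["pharmacy", "patient", "dose"], "médicament"),
    ("drug", ["doctor", "dose", "treatment"], "médicament"),
]


def train(examples):
    """Count translations per phrase and context words per (phrase, translation)."""
    sense_counts = defaultdict(Counter)
    context_counts = defaultdict(Counter)
    vocab = set()
    for phrase, context, translation in examples:
        sense_counts[phrase][translation] += 1
        for word in context:
            context_counts[(phrase, translation)][word] += 1
            vocab.add(word)
    return sense_counts, context_counts, vocab


def static_distribution(model, phrase):
    """Context-independent P(translation | phrase), like a plain lexicon entry."""
    candidates = model[0].get(phrase, Counter())
    total = sum(candidates.values())
    return {t: c / total for t, c in candidates.items()}


def context_distribution(model, phrase, context):
    """Naive Bayes P(translation | phrase, context) with add-one smoothing."""
    _, context_counts, vocab = model
    log_scores = {}
    for translation, prior in static_distribution(model, phrase).items():
        counts = context_counts[(phrase, translation)]
        denom = sum(counts.values()) + len(vocab)
        log_p = math.log(prior)
        for word in context:
            log_p += math.log((counts[word] + 1) / denom)
        log_scores[translation] = log_p
    if not log_scores:
        return {}
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    z = sum(math.exp(v - top) for v in log_scores.values())
    return {t: math.exp(v - top) / z for t, v in log_scores.items()}


def best_translation(model, phrase, context, weight=1.0):
    """Pick the candidate maximizing log P_static + weight * log P_context."""
    static = static_distribution(model, phrase)
    contextual = context_distribution(model, phrase, context)
    return max(
        static,
        key=lambda t: math.log(static[t]) + weight * math.log(contextual[t]),
    )


model = train(TRAINING)
print(best_translation(model, "drug", ["police", "dealer"]))
print(best_translation(model, "drug", ["doctor", "patient"]))
```

In this sketch the keys are single words, but nothing in the data structures prevents multi-word source phrases from being used as keys, which is the direction of generalization the abstract describes.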
