Aligning words using matrix factorisation

Cyril Goutte, Eric Gaussier, Kenji Yamada

doi:10.3115/1218955.1219019

Open Access

https://doi.org/10.3115/1218955.1219019

Publication Date: Jan 1, 2004

Citations: 42

Affiliation: Xerox (France)

#Bilingual Terminology Extraction #Mutual Translations

Abstract

Aligning words from sentences which are mutual translations is an important problem in different settings, such as bilingual terminology extraction, Machine Translation, or projection of linguistic features. Here, we view word alignment as matrix factorisation. In order to produce proper alignments, we show that factors must satisfy a number of constraints such as orthogonality. We then propose an algorithm for orthogonal non-negative matrix factorisation, based on a probabilistic model of the alignment data, and apply it to word alignment. This is illustrated on a French-English alignment task from the Hansard.
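To make the orthogonality constraint concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that each word on either side is assigned to exactly one shared hidden concept, called a "cept" here; the sentence pair, the cept indices, and every function name are hypothetical and chosen only for illustration. Under that assumption the two factors are binary one-hot matrices, their columns are orthogonal, and their product is a 0/1 alignment matrix.

```python
def one_hot_factor(assignments, n_cepts):
    """Binary factor: one row per word, one column per cept."""
    return [[1 if cept == k else 0 for k in range(n_cepts)] for cept in assignments]

def matmul_transposed(a, b):
    """Return a @ b^T for two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b] for row_a in a]

def columns_orthogonal(factor):
    """True if every pair of distinct columns has a zero dot product."""
    cols = list(zip(*factor))
    return all(
        sum(x * y for x, y in zip(cols[i], cols[j])) == 0
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )

# Hypothetical toy sentence pair (not taken from the Hansard).
french = ["le", "chat", "ne", "dort", "pas"]
english = ["the", "cat", "does", "not", "sleep"]

# Assumed cept index for each word; "ne" and "pas" share a cept with "does" and "not".
french_cepts = [0, 1, 2, 3, 2]
english_cepts = [0, 1, 2, 2, 3]

F = one_hot_factor(french_cepts, 4)   # 5 French words x 4 cepts
E = one_hot_factor(english_cepts, 4)  # 5 English words x 4 cepts

# alignment[i][j] == 1 exactly when French word i and English word j share a cept.
alignment = matmul_transposed(F, E)

for word, row in zip(french, alignment):
    aligned = [english[j] for j, value in enumerate(row) if value]
    print(word, "->", aligned)
```

Because every row of `F` and `E` contains a single 1, no word can belong to two cepts, which is what makes the columns orthogonal. A real-valued non-negative factorisation would not satisfy this automatically, which is presumably why the abstract treats orthogonality as a constraint to be enforced.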

Full Text