Abstract
Statistical phrase-based machine translation models crucially rely on word alignments. The search for word alignments assumes a model of word locality between source and target languages that is violated in language pairs with starkly different word orders, such as English-Hindi. In this article, we present models that decouple the steps of lexical selection and lexical reordering with the aim of minimizing the role of word alignment in machine translation. Indian languages are morphologically rich and have relatively free word order: the grammatical role of content words is largely determined by their case markers and not just by their positions in the sentence. Hence, lexical selection plays a far greater role than lexical reordering. For lexical selection, we investigate models that take the entire source sentence into account, and we evaluate their performance on English-Hindi translation in a tourism domain.