Abstract

This article presents statistical language translation models, called ``dependency transduction models'', based on collections of ``head transducers''. Head transducers are middle-out finite-state transducers which translate a head word in a source string into its corresponding head in the target language, and further translate sequences of dependents of the source head into sequences of dependents of the target head. The models are intended to capture the lexical sensitivity of direct statistical translation models, while at the same time taking account of the hierarchical phrasal structure of language. Head transducers are suitable for direct recursive lexical translation, and are simple enough to be trained fully automatically. We present a method for fully automatic training of dependency transduction models for which the only input is transcribed and translated speech utterances. The method has been applied to create English–Spanish and English–Japanese translation models for speech translation applications. The dependency transduction model gives around 75% accuracy for an English–Spanish translation task (using a simple string edit-distance measure) and 70% for an English–Japanese translation task. Enhanced with target n-grams and a case-based component, English–Spanish accuracy is over 76%; for English–Japanese it is 73% for transcribed speech, and 60% for translation from recognition word lattices.
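The recursive, head-first translation scheme described above can be illustrated with a toy sketch. This is not the paper's statistical model (which learns weighted head-transducer transitions automatically from bilingual data); the lexicon, the tree encoding, and the pro-drop handling below are invented purely for illustration:

```python
# Toy sketch of recursive head-outward ("middle-out") translation,
# English -> Spanish. NOT the authors' trained model: the lexicon and
# tree encoding are hypothetical. Each source head maps to a target head
# (or None when dropped, e.g. Spanish pro-drop subject pronouns), and the
# head's left/right dependent subtrees are translated recursively.

HEAD_LEXICON = {"want": "quiero", "ticket": "billete", "a": "un", "I": None}

def transduce(head, left_deps, right_deps):
    """Translate a dependency tree (head, left deps, right deps) middle-out:
    the head word first, then each dependent subtree recursively."""
    target_head = HEAD_LEXICON.get(head, head)  # fall back to copying the word
    left_out = [w for d in left_deps for w in transduce(*d)]
    right_out = [w for d in right_deps for w in transduce(*d)]
    # Dropped heads (None) are filtered out of the output string.
    return [w for w in left_out + [target_head] + right_out if w is not None]

# "I want a ticket": head "want", left dependent "I",
# right dependent "ticket" (which itself has left dependent "a").
tree = ("want", [("I", [], [])], [("ticket", [("a", [], [])], [])])
print(" ".join(transduce(*tree)))  # -> quiero un billete
```

The actual dependency transduction models additionally attach probabilities to transducer transitions and can reorder dependents between source and target, neither of which this sketch attempts to show.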
