Gaussian Mixture Clustering and Language Adaptation for the Development of a New Language Speech Recognition System

Nikos Chatzichrisafis, Costas Harizakis, Vassilios Digalakis, Vassilios Diakoloukas

doi:10.1109/tasl.2006.885259

Abstract

Porting a speech recognition system to a new language is usually a time-consuming and expensive process, since it requires collecting, transcribing, and processing a large amount of language-specific training sentences. This work presents techniques for improved cross-language transfer of speech recognition systems to new target languages. Such techniques are particularly useful for target languages where minimal amounts of training data are available. We describe a novel method to produce a language-independent system by combining acoustic models from a number of source languages. This intermediate language-independent acoustic model is used to bootstrap a target-language system by applying language adaptation. For our experiments, we use acoustic models of seven source languages to develop a target Greek acoustic model. We show that our technique significantly outperforms a system trained from scratch when less than 8 h of read speech is available.

Full Text