Abstract

This paper presents three methods for developing multilingual phone models for flexible speech recognition tasks. The main goal of our investigations is to find multilingual speech units that work equally well in many languages. With such a universal set, speech recognition systems can be built for a variety of languages. One advantage of this approach is that the acoustic–phonetic parameters of an HMM-based speech recognition system can be shared across languages. The multilingual approach starts with the phone sets of six languages, a total of 232 language-dependent, context-independent phone models. We then develop three methods for mapping the language-dependent models to a multilingual phone set. The first method is a direct mapping to the phone set of the International Phonetic Association (IPA). The second method applies an automatic clustering algorithm to the phone models. The third method exploits the similarities between single mixture components of the language-dependent models and proceeds in two steps. First, as in the first method, the language-specific models are mapped to the IPA inventory. Second, an agglomerative clustering is performed at the density level to find regions of similarity between the phone models of different languages. Experiments carried out with the SpeechDat(M) database show that the third method yields almost the same recognition rate as language-dependent models, while substantially reducing the number of densities in the multilingual system.
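
The density-level clustering step of the third method can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the distance measure or the stopping rule, so the squared Euclidean distance between mean vectors, the `threshold` parameter, the reduction of each density to its mean, and the labels such as `de:a#1` are all assumptions made for illustration. The sketch takes the mixture densities of the language-dependent models that were mapped to one IPA symbol and repeatedly merges the closest pair.

```python
from itertools import combinations


def distance(a, b):
    """Squared Euclidean distance between two mean vectors.
    Assumed stand-in; the abstract does not name the measure used."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def merge(c1, c2):
    """Merge two clusters; the new mean is the count-weighted average."""
    n = c1["count"] + c2["count"]
    mean = tuple((c1["count"] * x + c2["count"] * y) / n
                 for x, y in zip(c1["mean"], c2["mean"]))
    return {"mean": mean, "count": n, "members": c1["members"] + c2["members"]}


def cluster_densities(densities, threshold):
    """Bottom-up (agglomerative) clustering: merge the closest pair of
    clusters until no pair is closer than `threshold` (assumed stop rule)."""
    clusters = [{"mean": tuple(m), "count": 1, "members": [name]}
                for name, m in densities]
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: distance(clusters[p[0]]["mean"],
                                          clusters[p[1]]["mean"]))
        if distance(clusters[i]["mean"], clusters[j]["mean"]) > threshold:
            break
        merged = merge(clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical toy data: densities of models from different languages that
# were all mapped to the same IPA symbol. Labels and values are invented.
densities = [
    ("de:a#1", (1.0, 0.0)),
    ("es:a#1", (1.1, 0.1)),
    ("en:a#1", (5.0, 5.0)),
    ("de:a#2", (5.2, 4.9)),
]
result = cluster_densities(densities, threshold=0.5)
for c in result:
    print(sorted(c["members"]), c["mean"])
```

In this toy run, four language-dependent densities collapse into two shared ones, which is the kind of reduction in the number of densities that the abstract reports for the multilingual system.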
