Abstract

To deal with large lexica (more than 2000 words) automatic speech recognition systems (ASR) use an internal phonetic representation of the speech signal and phonemic models of pronunciation from the lexicon to search for the spoken word chain or sentence. Therefore it is possible to model different pronunciations of a word in the lexicon. In German we observed that individual speakers pronounce words in a typical way that depends on several factors as sex, age, place of living, place of birth, etc. Our goal is to enhance speech recognition by automatically adapting the models of pronunciation in the lexicon to the unknown speaker. The obvious problem is: You cannot wait until the present speaker has uttered approximately 2000 different words at least once. We solved this problem by generalization of observed rules of differing pronunciation to words not yet observed. Another method presented in this paper is speaker adaptation by re-estimating the a posteriori probabilities of the phonetic units used in a “bottom up” ASR system. A word hypothesis is evaluated by the product of the a posteriori probabilities of the phonetic units produced by the classification to the phonetic units belonging to the word hypothesis. Normally these probabilities are estimated during the training of the ASR system and stay fixed during the test. We propose an algorithm which observes the typical confusions of phonetic units of the unknown speaker and adapts the a posteriori probabilities continuously.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.