Abstract

This paper proposes a supervised speaker adaptation technique for speech recognition based on continuous density hidden Markov models (HMMs). When the number of adaptation words a user must utter is reduced to lighten the burden on the user, a larger proportion of demisyllables (the recognition units) no longer appears in the uttered words. In the proposed method, the HMM parameters for the demisyllables contained in the adaptation data are corrected first, and the HMM parameters for the demisyllables not contained in the adaptation data are then corrected by interpolation in the parameter space (spectral interpolation). To avoid parameter estimates that are biased toward the particular adaptation data set, a correction is made based on large-scale speech data uttered by a large number of speakers. The proposed method is evaluated in a word recognition experiment on 100 similar words, which simulates 5000-word large-vocabulary speech recognition. With a speaker-independent HMM recognition rate of 81.2 percent, adaptation with 50 words improves the recognition rate to 85.2 percent, which verifies the effectiveness of the proposed method.
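The two-stage idea described above (correct the units seen in the adaptation data, then interpolate a correction for the unseen ones) can be illustrated with a minimal sketch. The abstract does not give the interpolation formula, so everything specific here is an assumption made for illustration: each unit is reduced to a single mean vector, the correction is a mean shift, and the shift for an unseen unit is an inverse-distance weighted average over its `k` nearest seen units in the speaker-independent parameter space. The function name `interpolate_shifts` and its parameters are hypothetical, and the correction based on multi-speaker data is not modeled.

```python
import numpy as np


def interpolate_shifts(si_means, adapted_means, seen, k=2):
    """Illustrative sketch only, not the paper's exact procedure.

    si_means:      (N, D) speaker-independent mean vector per unit.
    adapted_means: (N, D) adapted means; rows are used only where seen[i] is True.
    seen:          (N,) booleans, True if the unit occurs in the adaptation data.
    Returns an (N, D) array of adapted means for all units.
    """
    si_means = np.asarray(si_means, dtype=float)
    adapted_means = np.asarray(adapted_means, dtype=float)
    seen = np.asarray(seen, dtype=bool)

    # Stage 1: units present in the adaptation data take their corrected means.
    shifts = adapted_means[seen] - si_means[seen]  # shape (S, D)
    out = si_means.copy()
    out[seen] = adapted_means[seen]

    # Stage 2: each unseen unit receives a shift interpolated from its nearest
    # seen units (assumed inverse-distance weighting in parameter space).
    for i in np.flatnonzero(~seen):
        dist = np.linalg.norm(si_means[seen] - si_means[i], axis=1)
        nearest = np.argsort(dist)[:k]
        weights = 1.0 / (dist[nearest] + 1e-8)
        weights /= weights.sum()
        out[i] = si_means[i] + weights @ shifts[nearest]
    return out
```

With three units whose speaker-independent means are `[0, 0]`, `[2, 0]`, and `[1, 0]`, where only the first two are seen and both are shifted by `[1, 1]`, the unseen third unit lies midway between them and receives the same shift.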
