Abstract

We present a maximum likelihood (ML) stochastic matching approach to decrease the acoustic mismatch between a test utterance $Y$ and a given set of speech hidden Markov models $\Lambda_X$ so as to reduce the recognition performance degradation caused by possible distortions in the test utterance. This mismatch may be reduced in two ways: (1) by an inverse distortion function $F_\nu(\cdot)$ that maps $Y$ into an utterance $X$ which matches better with the models $\Lambda_X$, and (2) by a model transformation function $G_\eta(\cdot)$ that maps $\Lambda_X$ to the transformed model $\Lambda_Y$ which matches better with the utterance $Y$. The functional form of the transformations depends upon our prior knowledge about the mismatch, and the parameters are estimated along with the recognized string in a maximum likelihood manner using the EM algorithm. Experimental results verify the efficacy of the approach in improving the performance of a continuous speech recognition system in the presence of mismatch due to different transducers and transmission channels.
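
As a rough illustration of the feature-space route (1), the sketch below estimates a single additive bias with EM. It is not the paper's implementation, and everything specific in it is an assumption made for the example: the inverse distortion is taken to be F(y) = y - b, the features are scalar, a fixed Gaussian mixture stands in for the HMM state densities, and the recognized string is not re-estimated. The names `simulate` and `estimate_bias` are hypothetical.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a scalar Gaussian N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def simulate(n, weights, means, variances, bias, seed=0):
    """Draw n clean samples from the mixture and add a constant bias (assumed distortion)."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        x = rng.gauss(means[k], math.sqrt(variances[k]))
        ys.append(x + bias)
    return ys


def estimate_bias(ys, weights, means, variances, n_iter=20):
    """EM estimate of b under the assumed model y = x + b, with x drawn from a fixed mixture."""
    b = 0.0
    for _ in range(n_iter):
        num = 0.0
        den = 0.0
        for y in ys:
            # E-step: component posteriors for the compensated sample y - b.
            logp = [
                math.log(w) + log_gauss(y - b, m, v)
                for w, m, v in zip(weights, means, variances)
            ]
            mx = max(logp)
            unnorm = [math.exp(lp - mx) for lp in logp]
            total = sum(unnorm)
            # M-step accumulators: precision-weighted residuals y - mean.
            for u, m, v in zip(unnorm, means, variances):
                gamma = u / total
                num += gamma * (y - m) / v
                den += gamma / v
        b = num / den
    return b


if __name__ == "__main__":
    weights, means, variances = [0.5, 0.5], [-2.0, 2.0], [0.25, 0.25]
    ys = simulate(2000, weights, means, variances, bias=0.7)
    print(estimate_bias(ys, weights, means, variances))
```

With the synthetic settings in the `__main__` block, the printed estimate should land near the simulated bias of 0.7. In a full recognizer the component posteriors would instead come from the HMM state alignment of the current hypothesis, and decoding and bias estimation would alternate.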
