Abstract

A probabilistic mixture model is described for a frame (the short-term spectrum) of speech, for use in speech recognition. Each component of the mixture is regarded as a prototype for the labeling phase of a hidden Markov model based speech recognition system. Since the ambient noise during recognition can differ from that present in the training data, the model is designed for convenient updating in changing noise. Based on the observation that the energy in a frequency band is, at any fixed time, dominated either by signal energy or by noise energy, the energy is modeled as the larger of the separate energies of signal and noise in the band. Statistical algorithms are given for training this as a hidden-variables model. The hidden variables are the prototype identities and the separate signal and noise components. Speech recognition experiments that successfully utilize this model are described.
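
The "larger of the two energies" idea can be illustrated with a small sketch. The abstract does not state which distributions are used, so the code below assumes, purely for illustration, that the signal and noise energies in each band are independent Gaussians and that bands are independent given the prototype. Under that assumption the density of z = max(x, n) is f_x(z) F_n(z) + f_n(z) F_x(z). All function names and parameters here are hypothetical and are not taken from the paper.

```python
import math


def normal_pdf(z, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    u = (z - mean) / std
    return math.exp(-0.5 * u * u) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(z, mean, std):
    """Cumulative distribution function of the same Gaussian."""
    return 0.5 * (1.0 + math.erf((z - mean) / (std * math.sqrt(2.0))))


def max_model_pdf(z, sig_mean, sig_std, noise_mean, noise_std):
    """Density of max(signal, noise) for one band.

    Assumption (not from the source): signal and noise are independent
    Gaussians. The observed value z is then either the signal (with the
    noise below z) or the noise (with the signal below z).
    """
    signal_dominates = normal_pdf(z, sig_mean, sig_std) * normal_cdf(z, noise_mean, noise_std)
    noise_dominates = normal_pdf(z, noise_mean, noise_std) * normal_cdf(z, sig_mean, sig_std)
    return signal_dominates + noise_dominates


def mixture_likelihood(frame, weights, prototypes, noise):
    """Likelihood of one frame under a mixture of prototypes.

    frame:      list of observed band energies
    weights:    mixture weight of each prototype (should sum to 1)
    prototypes: per prototype, a list of (mean, std) pairs, one per band
    noise:      list of (mean, std) pairs, one per band, shared by all
                prototypes so it can be swapped when the ambient noise changes
    """
    total = 0.0
    for weight, bands in zip(weights, prototypes):
        component = weight
        for z, (s_mean, s_std), (n_mean, n_std) in zip(frame, bands, noise):
            component *= max_model_pdf(z, s_mean, s_std, n_mean, n_std)
        total += component
    return total
```

Because the noise parameters are kept separate from the prototype parameters in this sketch, adapting to a new noise environment only requires replacing the `noise` argument, which mirrors the "convenient updating in changing noise" described above.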
