Abstract

We report work on speech enhancement that combines sequential noise estimation with perceptual filtering. The sequential estimation employs an extension of the sequential EM-type algorithm. In this algorithm, the statistics of clean speech are modeled by hidden Markov models (HMMs), and the noise is assumed to be Gaussian distributed with a time-varying mean vector (the noise parameter) that is to be estimated. The estimation process uses a non-linear function that relates the speech statistics, the noise, and the noisy observation. With the estimated noise parameter, subtraction-type algorithms for speech enhancement can be extended to non-stationary environments. In particular, a perceptual filter with frequency masking is constructed to trade off noise reduction against speech distortion, taking into account the sensitivity of speech recognition systems to speech distortion. Our experiments on speech enhancement and speech recognition in non-stationary noise suggest that this approach is promising for improving performance compared with alternative speech enhancement algorithms.
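
As a rough illustration of two ingredients named above, the following sketch shows one commonly used form of a non-linear relation between clean speech, noise, and the noisy observation, together with a subtraction-type gain driven by a per-frame noise estimate. The abstract does not give the exact function or filter, so everything here is an assumption: the log-power mismatch formula (additive powers, phase ignored), the function names, and the `over_subtraction` and `gain_floor` parameters are hypothetical. The fixed gain floor only stands in for the perceptual masking rule, which is not specified in the text.

```python
import math


def log_spectral_mismatch(clean_log_power, noise_log_power):
    """Noisy log-power under the ASSUMPTION that speech and noise powers add.

    Computes log(exp(x) + exp(n)) in a numerically stable way. This is a
    common modeling choice, not necessarily the function used in the paper.
    """
    hi = max(clean_log_power, noise_log_power)
    lo = min(clean_log_power, noise_log_power)
    return hi + math.log1p(math.exp(lo - hi))


def subtraction_gain(noisy_power, noise_power, over_subtraction=1.0, gain_floor=0.1):
    """Amplitude gain for one frequency bin using power spectral subtraction.

    gain_floor is a hypothetical stand-in for a masking-based limit on how
    much attenuation (and hence speech distortion) is allowed.
    """
    if noisy_power <= 0.0:
        return gain_floor
    ratio = 1.0 - over_subtraction * noise_power / noisy_power
    return max(math.sqrt(max(ratio, 0.0)), gain_floor)


def frame_gains(noisy_frames, noise_frames, **kwargs):
    """Apply the gain rule frame by frame with a time-varying noise estimate.

    noisy_frames and noise_frames are lists of frames; each frame is a list
    of per-bin power values. The noise estimate may differ in every frame,
    which is what lets a subtraction-type rule follow non-stationary noise.
    """
    return [
        [subtraction_gain(y, n, **kwargs) for y, n in zip(noisy, noise)]
        for noisy, noise in zip(noisy_frames, noise_frames)
    ]
```

For example, `frame_gains([[4.0, 1.0]], [[1.0, 2.0]])` leaves the first bin largely intact and clamps the second bin, where the estimated noise exceeds the observation, to the floor.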
