Abstract

We investigate the enhancement of speech corrupted by unknown independent additive noise when only a single microphone is available. We present adaptive enhancement systems based on an existing non-adaptive technique [Ephraim, Y., 1992a. IEEE Transactions on Signal Processing 40 (4), 725–735]. This approach models the speech and noise statistics using autoregressive hidden Markov models (AR-HMMs). We develop two main extensions. The first estimates the noise statistics from detected pauses. The second forms maximum likelihood (ML) estimates of the unknown noise parameters using the whole utterance. Both techniques operate within the AR-HMM framework. We have previously shown that the ability of AR-HMMs to model speech can be improved by incorporating perceptual frequency using the bilinear transform. We incorporate this improvement into our enhancement systems. We evaluate our techniques on the NOISEX-92 and Resource Management (RM) databases, giving indications of performance on simple and more complex tasks, respectively. Both proposed enhancement schemes improve substantially on the baseline results. The technique of forming ML estimates of the noise parameters is found to be the more effective of the two. Its performance is evaluated over a wide range of noise conditions, from −6 to 18 dB, and on various types of stationary real-world noise.
