A Cepstrum Domain HMM-Based Speech Enhancement Method Applied to Non-Stationary Noise

Mikael Nilsson, Ingvar Claesson, Mattias Dahl

doi:10.1007/0-387-22928-0_1

Abstract

This paper presents a hidden Markov model (HMM)-based speech enhancement method aimed at reducing non-stationary noise in speech signals. The system is based on the assumption that the speech and the noise are additive and uncorrelated. Cepstral features are used to extract statistical information from both the speech and the noise. A priori statistical information is collected from long training sequences into ergodic hidden Markov models. Given the ergodic models for the speech and the noise, a compensated speech-noise model is created by means of parallel model combination, using a log-normal approximation. During the compensation, the mean of every mixture in the speech and noise models is stored. The stored means are then used in the enhancement process to create the most likely speech and noise power spectral distributions, using the forward algorithm combined with the mixture probabilities. These distributions are used to generate a Wiener filter for every observation. The paper includes a performance evaluation of the speech enhancer in stationary as well as non-stationary noise environments.
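As a rough illustration of two steps named in the abstract, the sketch below shows a log-normal combination of one speech mixture with one noise mixture, and a Wiener gain built from probability-weighted power spectra. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes diagonal covariances, assumes the cepstral parameters have already been mapped to the log-spectral domain, and assumes the per-mixture probabilities are supplied by a forward pass that is not shown. The function names `lognormal_combine` and `wiener_gain` are hypothetical.

```python
import numpy as np


def lognormal_combine(mu_s, var_s, mu_n, var_n):
    """Combine log-spectral Gaussians of speech and noise (diagonal case).

    Assumption: inputs are per-frequency means and variances in the
    log-spectral domain, one mixture of each model.
    """
    # Log domain -> linear domain, using the moments of a log-normal variable.
    m_s = np.exp(mu_s + var_s / 2.0)
    m_n = np.exp(mu_n + var_n / 2.0)
    v_s = m_s ** 2 * (np.exp(var_s) - 1.0)
    v_n = m_n ** 2 * (np.exp(var_n) - 1.0)

    # Additive, uncorrelated signals: linear-domain means and variances add.
    m = m_s + m_n
    v = v_s + v_n

    # Linear domain -> log domain, treating the sum as log-normal again
    # (this is the approximation step).
    var = np.log(v / m ** 2 + 1.0)
    mu = np.log(m) - var / 2.0
    return mu, var


def wiener_gain(weights, speech_psd, noise_psd):
    """Wiener gain for one observation.

    weights:    (K,) probabilities of the K combined mixtures (assumed given)
    speech_psd: (K, F) stored speech power spectra, one row per mixture
    noise_psd:  (K, F) stored noise power spectra, one row per mixture
    """
    s = weights @ speech_psd  # expected speech power spectrum, shape (F,)
    n = weights @ noise_psd   # expected noise power spectrum, shape (F,)
    return s / (s + n)


if __name__ == "__main__":
    mu, var = lognormal_combine(np.log([2.0]), np.array([0.1]),
                                np.log([3.0]), np.array([0.1]))
    gain = wiener_gain(np.array([0.5, 0.5]),
                       np.array([[1.0, 3.0], [3.0, 1.0]]),
                       np.array([[2.0, 2.0], [2.0, 2.0]]))
    print(mu, var, gain)
```

The gain would be applied to the noisy short-time spectrum of each observation. How the mixture probabilities are computed and how the filter is applied in the paper is described in the full text, not in this sketch.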

Full Text