Accurate modeling and estimation of speech and noise gains facilitate good performance of speech enhancement methods using data-driven prior models. In this paper, we propose a hidden Markov model (HMM)-based speech enhancement method using explicit gain modeling. Through the introduction of stochastic gain variables, energy variation in both speech and noise is explicitly modeled in a unified framework. The speech gain models the energy variations of the speech phones, typically due to differences in pronunciation and/or different vocalizations of individual speakers. The noise gain helps to improve the tracking of the time-varying energy of nonstationary noise. The expectation-maximization (EM) algorithm is used to perform offline estimation of the time-invariant model parameters. The time-varying model parameters are estimated online using the recursive EM algorithm. The proposed gain modeling techniques are applied to a novel Bayesian speech estimator, and the performance of the proposed enhancement method is evaluated through objective and subjective tests. The experimental results confirm the advantage of explicit gain modeling, particularly for nonstationary noise sources
Read full abstract