Abstract

We present a fully Bayesian hierarchical approach to multichannel speech enhancement with a time-varying audio channel. Our probabilistic approach relies on a Gaussian prior for the speech signal and a Gamma hyperprior for the speech precision, combined with a multichannel linear-Gaussian state-space model for the acoustic channel. Furthermore, we assume a Wishart prior for the noise precision matrix. We derive a variational expectation-maximization (VEM) algorithm that uses a variant of the multichannel Wiener filter (MCWF) to infer the sound source and a Kalman smoother to infer the acoustic channel. We further show that the VEM speech estimator can be recast as a multichannel minimum variance distortionless response (MVDR) beamformer followed by a single-channel variational postfilter. The proposed algorithm is evaluated in both simulated and real room environments with several noise types and reverberation levels, in both static and dynamic scenarios. In terms of speech quality, the method yields a significant improvement over the noisy signal and outperforms a baseline algorithm. In terms of channel alignment and tracking ability, it also demonstrates a superior channel estimate.
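The beamformer-plus-postfilter structure mentioned above has a well-known classical counterpart, which the following minimal sketch illustrates for a single frequency bin. It is not the variational estimator derived in the paper: it assumes a known, fixed channel vector `h`, a known noise covariance `noise_cov`, and a known speech variance `speech_var`, and it uses a plain Wiener gain in place of the variational postfilter. All function and variable names are hypothetical.

```python
import numpy as np


def mvdr_weights(h, noise_cov):
    """MVDR weights w = Phi_v^{-1} h / (h^H Phi_v^{-1} h); output is w^H x."""
    phi_inv_h = np.linalg.solve(noise_cov, h)
    return phi_inv_h / (h.conj() @ phi_inv_h)


def wiener_postfilter(h, noise_cov, speech_var):
    """Single-channel Wiener gain applied to the MVDR output.

    The residual noise power at the MVDR output is 1 / (h^H Phi_v^{-1} h).
    """
    residual_noise = 1.0 / np.real(h.conj() @ np.linalg.solve(noise_cov, h))
    return speech_var / (speech_var + residual_noise)


def mcwf_weights(h, noise_cov, speech_var):
    """Classical MCWF weights w = phi_s (phi_s h h^H + Phi_v)^{-1} h."""
    mixture_cov = speech_var * np.outer(h, h.conj()) + noise_cov
    return speech_var * np.linalg.solve(mixture_cov, h)
```

Under these assumptions, `mcwf_weights(h, noise_cov, speech_var)` equals `wiener_postfilter(h, noise_cov, speech_var) * mvdr_weights(h, noise_cov)` up to numerical precision, which follows from the matrix inversion lemma. The paper's result is the analogous factorization for the VEM speech estimator, where the channel and the noise statistics are inferred rather than given.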
