Abstract
This paper addresses MMSE filtering of speech in additive noise when the speech power spectral density (PSD) is latent. This problem is prominent in single-channel speech enhancement and limits the utility of stationary Wiener filters and other PSD-based statistical estimators. It typically manifests as residual noise after filtering, even when the noise PSD is available. We therefore incorporate the latent speech PSD state, via marginalization, into the MMSE estimation framework for complex speech spectral amplitudes. The resulting joint posterior distribution of the complex speech amplitude and the speech PSD, conditioned only on the noisy observations, is then factored in the Bayesian sense into a speech posterior and a speech-PSD posterior. The latter is expressed via the local data likelihood and a hyper-prior of the local speech PSD or a-priori SNR, i.e., a global distribution across the entire speech signal. In this way, marginalization becomes an expectation over a latent Wiener filter, which eliminates explicit estimation of the local a-priori SNR. The local input data, in the form of the a-posteriori SNR, together with the global SNR value as a descriptor of the overall speech-in-noise condition, turn out to be sufficient to control the resulting MMSE spectral gain function, and can potentially be provided much more easily than the latent, time-varying a-priori SNR. Objective experimental evaluation demonstrates an improved balance between residual noise and speech quality in the enhancement of noisy speech.
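As a rough numerical illustration of "expectation over a latent Wiener filter", the sketch below averages the Wiener gain ξ/(1+ξ) over a posterior of the a-priori SNR ξ given the a-posteriori SNR γ. It is not the paper's gain function. It assumes complex Gaussian speech and noise, so that γ given ξ is exponentially distributed with mean 1+ξ, and it assumes an exponential hyper-prior on ξ whose mean equals the global SNR. The abstract does not specify the hyper-prior, so that choice, the integration grid, and the name `marginalized_gain` are all hypothetical.

```python
import numpy as np

def marginalized_gain(gamma, global_snr, num_points=4000):
    """Posterior-mean Wiener gain E[xi / (1 + xi) | gamma].

    gamma      : a-posteriori SNR |Y|^2 / noise PSD (scalar or array, linear scale)
    global_snr : mean of the ASSUMED exponential hyper-prior on xi (linear scale)
    Returns a 1-D array with one gain per entry of gamma.
    """
    gamma = np.atleast_1d(np.asarray(gamma, dtype=float))[:, None]

    # Uniform grid in log(xi); the range is an illustrative choice.
    xi = np.logspace(-6, 3, num_points) * global_snr

    # Likelihood of gamma given xi under the complex Gaussian assumption.
    likelihood = np.exp(-gamma / (1.0 + xi)) / (1.0 + xi)

    # ASSUMED hyper-prior: exponential with mean global_snr.
    prior = np.exp(-xi / global_snr) / global_snr

    # On a uniform log grid, d(xi) = xi * d(log xi); the constant
    # d(log xi) cancels in the ratio of the two sums below.
    weights = likelihood * prior * xi

    wiener = xi / (1.0 + xi)
    return (weights * wiener).sum(axis=1) / weights.sum(axis=1)

gains = marginalized_gain([0.5, 2.0, 10.0], global_snr=1.0)
```

Only γ and the global SNR enter the function; no local estimate of ξ is required, which is the point the abstract makes. Under these assumptions the gain lies strictly between 0 and 1 and grows with both γ and the global SNR.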
More From: IEEE/ACM Transactions on Audio, Speech, and Language Processing