Abstract

Many well-known and frequently employed Bayesian clean speech estimators have been derived under the assumption that the true power spectral densities (PSDs) of speech and noise are exactly known. In practice, however, only power spectral density (PSD) estimates are available. Simply neglecting PSD estimation errors and handling the estimates as true values leads to speech estimation errors causing musical noise and undesired suppression of speech. In this paper, the uncertainty of the available speech PSD estimates is addressed. The main contributions are the following. First, we summarize and examine ways to model and incorporate the uncertainty of PSD estimates for a more robust speech enhancement performance. Second, a novel nonlinear clean speech estimator is derived that takes into account prior knowledge about the absolute value of typical speech PSDs. Third, we show that the derived statistical framework provides uncertainty-aware counterparts to a number of well-known conventional clean speech estimators such as the Wiener filter and Ephraim and Malah's amplitude estimators. Fourth, we show how modern PSD estimators can be incorporated into the theoretical framework and propose to employ frequency dependent priors. Finally, the effects and benefits of considering the uncertainty of speech PSD estimates are analyzed, discussed, and evaluated via instrumental measures and a listening experiment.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.