Abstract

Power spectral density (PSD) estimates of various microphone signal components are essential to many speech enhancement procedures. As speech is highly non-nonstationary, performance improvements may be gained by maintaining time-variations in PSD estimates. In this paper, we propose an instantaneous PSD estimation approach based on generalized principal components. Similarly to other eigenspace-based PSD estimation approaches, we rely on recursive averaging in order to obtain a microphone signal correlation matrix estimate to be decomposed. However, instead of estimating the PSDs directly from the temporally smooth generalized eigenvalues of this matrix, yielding temporally smooth PSD estimates, we propose to estimate the PSDs from newly defined instantaneous generalized eigenvalues, yielding instantaneous PSD estimates. The instantaneous generalized eigenvalues are defined from the generalized principal components, i.e. a generalized eigenvector-based transform of the microphone signals. We further show that the smooth generalized eigenvalues can be understood as a recursive average of the instantaneous generalized eigenvalues. Simulation results comparing the multi-channel Wiener filter (MWF) with smooth and instantaneous PSD estimates indicate better speech enhancement performance for the latter. A MATLAB implementation is available online.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.