Analysis of Eigenvalue Decomposition-Based Late Reverberation Power Spectral Density Estimation

Ina Kodrasi, Simon Doclo

doi:10.1109/taslp.2018.2811184

Abstract

Many speech dereverberation techniques require an estimate of the late reverberation power spectral density (PSD). State-of-the-art multichannel methods for estimating the late reverberation PSD typically rely on, first, an estimate of the relative transfer functions (RTFs) of the target signal; second, a model for the spatial coherence matrix of the late reverberation; and finally, an estimate of the reverberant speech or reverberant and noisy speech PSD matrix. The RTFs, the spatial coherence matrix, and the speech PSD matrix are all prone to modeling and estimation errors in practice, with the RTFs being particularly difficult to estimate accurately, especially in highly reverberant and noisy scenarios. Recently, we proposed an eigenvalue decomposition (EVD)-based late reverberation PSD estimator, which does not require an estimate of the RTFs. In this paper, this EVD-based PSD estimator is further analyzed and its estimation accuracy and computational complexity are analytically compared to a state-of-the-art maximum likelihood (ML)-based PSD estimator. It is shown that for perfect knowledge of the RTFs, spatial coherence matrix, and reverberant speech PSD matrix, the ML-based and the EVD-based PSD estimates are both equal to the true late reverberation PSD. In addition, it is shown that for erroneous RTFs but perfect knowledge of the spatial coherence matrix and reverberant speech PSD matrix, the ML-based PSD estimate is larger than or equal to the true late reverberation PSD, whereas the EVD-based PSD estimate, which does not depend on the RTFs, is still equal to the true late reverberation PSD. Finally, it is shown that when modeling and estimation errors occur in all quantities, the ML-based PSD estimate is larger than or equal to the EVD-based PSD estimate. Simulation results for several realistic acoustic scenarios demonstrate the advantages of using the EVD-based PSD estimator in a multichannel Wiener filter, yielding a significantly better performance than the ML-based PSD estimator.
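
The claim that an EVD-based estimate recovers the true late reverberation PSD without any RTF knowledge can be illustrated numerically. The following is a minimal sketch, not the estimator as specified in the paper: it assumes the common signal model in which the reverberant speech PSD matrix is a rank-one target term plus the late reverberation PSD times the spatial coherence matrix, and it assumes the estimate is taken as the mean of all but the largest eigenvalue of the prewhitened PSD matrix. The function name `evd_late_reverb_psd`, the variable names, and the randomly generated RTF vector and coherence matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def evd_late_reverb_psd(phi_x, gamma):
    """Sketch of an EVD-based late reverberation PSD estimate.

    phi_x : (M, M) reverberant speech PSD matrix (Hermitian)
    gamma : (M, M) spatial coherence matrix of the late reverberation
            (Hermitian, positive definite)
    Assumption: the estimate is the mean of the M-1 smallest eigenvalues
    of gamma^{-1} phi_x. No RTF vector is needed.
    """
    # Prewhiten with the coherence matrix: solve gamma @ X = phi_x
    prewhitened = np.linalg.solve(gamma, phi_x)
    # Eigenvalues are real in theory; discard numerical imaginary residue
    eigvals = np.sort(np.linalg.eigvals(prewhitened).real)
    # The largest eigenvalue also contains the target speech contribution
    return float(np.mean(eigvals[:-1]))


# Synthetic check under perfect knowledge of gamma and phi_x
rng = np.random.default_rng(0)
M = 4  # number of microphones (hypothetical)

# Hypothetical RTF vector, normalized to the first microphone
d = rng.standard_normal(M) + 1j * rng.standard_normal(M)
d = d / d[0]

# Hypothetical Hermitian positive definite coherence matrix
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
gamma = A @ A.conj().T + M * np.eye(M)

phi_s = 2.0  # target speech PSD (hypothetical value)
phi_r = 0.5  # true late reverberation PSD (hypothetical value)

# Assumed model: phi_x = phi_s * d d^H + phi_r * gamma
phi_x = phi_s * np.outer(d, d.conj()) + phi_r * gamma

estimate = evd_late_reverb_psd(phi_x, gamma)
print(estimate)
```

Under this model, the prewhitened matrix equals the rank-one target term plus phi_r times the identity, so M-1 of its eigenvalues equal phi_r regardless of d. This is why the estimate in the sketch is unaffected by errors in the RTFs.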

Full Text