Abstract
Using a recently proposed informed spatial filter, it is possible to effectively and robustly reduce reverberation from speech signals captured in noisy environments using multiple microphones. Late reverberation can be modeled by a diffuse sound field with a time-varying power spectral density (PSD). To attain reverberation reduction using this spatial filter, an accurate estimate of the diffuse sound PSD is required. In this work, a method is proposed to estimate the diffuse sound PSD from a set of reference signals by blocking the direct signal components. By considering multiple plane waves in the signal model to describe the direct sound, the method is suitable in the presence of multiple simultaneously active speakers. The proposed diffuse sound PSD estimator is analyzed and compared to existing estimators. In addition, the performance of the spatial filter computed with the diffuse sound PSD estimate is analyzed using simulated and measured room impulse responses in noisy environments with stationary noise and non-stationary babble noise.
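The blocking idea described in the abstract can be illustrated with a small numerical sketch. This is not the estimator proposed in the paper. It is a minimal example that assumes a narrowband model with known plane-wave directions, a spherically isotropic diffuse coherence, a known noise PSD matrix, and a least-squares fit of the blocked signal covariance. All function names, the array geometry, and the numerical values are hypothetical.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def diffuse_coherence(positions, freq_hz):
    """Spherically isotropic coherence matrix, sin(kd)/(kd), for all mic pairs."""
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    return np.sinc(2.0 * freq_hz * dist / C)  # np.sinc(x) = sin(pi x) / (pi x)


def steering_matrix(positions, doas_rad, freq_hz):
    """Far-field steering vectors (one column per plane wave) for a 2-D array."""
    directions = np.stack([np.cos(doas_rad), np.sin(doas_rad)])  # 2 x L
    delays = positions @ directions / C                           # M x L
    return np.exp(-2j * np.pi * freq_hz * delays)


def blocking_matrix(A):
    """Orthonormal basis B with B^H A = 0 (assumes A has full column rank L < M)."""
    n_waves = A.shape[1]
    U, _, _ = np.linalg.svd(A, full_matrices=True)
    return U[:, n_waves:]


def estimate_diffuse_psd(Phi_y, Phi_v, Gamma, B):
    """Least-squares fit of phi_d in B^H (Phi_y - Phi_v) B = phi_d * B^H Gamma B."""
    Phi_u = B.conj().T @ (Phi_y - Phi_v) @ B   # reference signals, noise removed
    G = B.conj().T @ Gamma @ B                 # diffuse coherence after blocking
    num = np.real(np.trace(G.conj().T @ Phi_u))
    den = np.real(np.trace(G.conj().T @ G))
    return max(num / den, 0.0)                 # a PSD cannot be negative


# Toy scenario (hypothetical values): 4-mic linear array, two plane waves.
positions = np.array([[0.05 * m, 0.0] for m in range(4)])
freq_hz = 1000.0
doas_rad = np.deg2rad([40.0, 110.0])

A = steering_matrix(positions, doas_rad, freq_hz)
Gamma = diffuse_coherence(positions, freq_hz)
phi_d_true = 0.3
Phi_v = 0.01 * np.eye(4)
Phi_y = A @ np.diag([1.0, 0.5]) @ A.conj().T + phi_d_true * Gamma + Phi_v

B = blocking_matrix(A)
phi_d_est = estimate_diffuse_psd(Phi_y, Phi_v, Gamma, B)
print(phi_d_est)
```

Because the sketch uses exact model covariances rather than estimates from noisy data, the fit recovers the true diffuse PSD up to rounding error. With estimated covariances and imperfect direction estimates, the result would deviate from the true value.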
Highlights
In speech communication scenarios, reverberation can degrade the speech quality and, in severe cases, the speech intelligibility [1].
4.5 Comparison to existing diffuse power spectral density (PSD) estimators: We evaluate the performance of the proposed diffuse PSD estimator and the three estimators described in Sections 3.1.1–3.1.3, denoted by LRSV, coherence-based signal-to-diffuse ratio estimator (CSDRE), and ambient beamformer (ABF), respectively.
5 Conclusions: We proposed a system for joint dereverberation and noise reduction for multiple simultaneously active desired direct sound plane waves.
Summary
Reverberation can degrade the speech quality and, in severe cases, the speech intelligibility [1]. Algorithms of the first class identify the acoustic system and equalize it (cf. [1] and the references therein). Given a perfect estimate of the acoustic system described by a finite impulse response, perfect dereverberation can be achieved by applying the multiple input/output inverse theorem [2] (i.e., by applying a multichannel equalizer). This approach is not robust against estimation errors of the acoustic impulse responses. As a consequence, this approach is sensitive to changes in the room and to position
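The multichannel equalization mentioned in the summary above can be illustrated with a toy example. The sketch below is a minimal illustration, assuming two short, exactly known impulse responses without common zeros. It designs equalizer filters by solving a linear system built from convolution matrices, then repeats the design with slightly perturbed impulse responses to show the sensitivity to estimation errors. The function names and all numerical values are hypothetical.

```python
import numpy as np


def convolution_matrix(h, n_cols):
    """Matrix H such that H @ g equals np.convolve(h, g) for len(g) == n_cols."""
    H = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        H[j:j + len(h), j] = h
    return H


def mint_equalizers(irs, delay=0):
    """Equalizers g_m such that sum_m conv(h_m, g_m) is a (delayed) unit impulse.

    Assumes equal-length impulse responses that share no common zeros.
    """
    n_ch, len_h = len(irs), len(irs[0])
    len_g = int(np.ceil((len_h - 1) / (n_ch - 1)))
    H = np.hstack([convolution_matrix(h, len_g) for h in irs])
    target = np.zeros(len_h + len_g - 1)
    target[delay] = 1.0
    g, *_ = np.linalg.lstsq(H, target, rcond=None)
    return g.reshape(n_ch, len_g)  # row m holds the equalizer for channel m


def equalized_response(irs, eqs):
    """Overall response of the true channels followed by the equalizers."""
    return sum(np.convolve(h, g) for h, g in zip(irs, eqs))


# Toy two-channel acoustic system (hypothetical values).
true_irs = [np.array([1.0, 0.5, 0.2]), np.array([1.0, -0.4, 0.1])]

# Design with perfect knowledge of the impulse responses.
eqs = mint_equalizers(true_irs)
response = equalized_response(true_irs, eqs)

# Design with a small estimation error, applied to the true system.
noisy_irs = [h + np.array([0.0, 0.05, 0.0]) for h in true_irs]
eqs_mismatched = mint_equalizers(noisy_irs)
response_mismatched = equalized_response(true_irs, eqs_mismatched)

print(np.round(response, 6))
print(np.round(response_mismatched, 6))
```

With exact impulse responses the overall response is a unit impulse. With the perturbed impulse responses, residual taps remain after equalization, which is the lack of robustness the summary refers to.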
More From: EURASIP Journal on Audio, Speech, and Music Processing