Abstract

Speech dereverberation is an important issue for many real-world speech processing applications. Among the techniques developed, the weighted prediction error (WPE) algorithm, which blindly cancels the late reverberation component from the reverberant mixture of microphone signals, has been widely adopted and advanced over the last decade. In this study, we extend the neural-network-based virtual acoustic channel expansion (VACE) framework for WPE-based speech dereverberation, a variant of the WPE that we recently proposed to enable the use of the dual-channel WPE algorithm in a single-microphone speech dereverberation scenario. Building on the previous study, we conduct ablation studies on the constituents of the VACE-WPE in an offline processing scenario. These studies reveal the characteristics of the system, which allow the architecture to be simplified and lead to new strategies for training the neural network for the VACE. Experimental results demonstrate that the VACE-WPE considerably outperforms its single-channel counterpart in simulated noisy reverberant environments in terms of objective speech quality, and that it is superior to the single-channel WPE as well as several fully neural speech dereverberation methods when employed as the front-end for a far-field automatic speech recognizer. Our PyTorch implementation and pre-trained models are available from https://github.com/dreadbird06/vace_wpe.
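
For readers unfamiliar with the WPE, the following is a minimal sketch of the conventional single-channel iteration for one STFT frequency bin. It is not the VACE-WPE and not the implementation in the repository above; the function name `wpe_single_bin`, its parameters (`delay`, `taps`, `iterations`, `floor`), and the small diagonal loading are assumptions made for illustration. Each pass estimates the time-varying power of the current output, solves a power-weighted least-squares problem for a delayed linear prediction filter, and subtracts the predicted late reverberation.

```python
import numpy as np


def wpe_single_bin(y, delay=3, taps=2, iterations=3, floor=1e-8):
    """Illustrative single-channel WPE for one STFT frequency bin.

    All parameter names and defaults are hypothetical choices for this sketch.
    """
    y = np.asarray(y, dtype=complex)
    num_frames = y.shape[0]

    # Column k holds the observation delayed by (delay + k) frames, zero-padded.
    y_bar = np.zeros((num_frames, taps), dtype=complex)
    for k in range(taps):
        shift = delay + k
        if shift < num_frames:
            y_bar[shift:, k] = y[:num_frames - shift]

    x = y.copy()  # initial estimate of the early-arriving speech
    for _ in range(iterations):
        # Time-varying power of the current estimate, floored for stability.
        power = np.maximum(np.abs(x) ** 2, floor)
        weighted = y_bar / power[:, None]
        # Weighted normal equations: (Y^H W Y) g = Y^H W y, with W = diag(1 / power).
        corr_mat = y_bar.conj().T @ weighted
        corr_vec = weighted.conj().T @ y
        # Small diagonal loading (an assumption of this sketch) keeps the solve stable.
        loading = 1e-10 * np.trace(corr_mat).real / taps + 1e-12
        g = np.linalg.solve(corr_mat + loading * np.eye(taps), corr_vec)
        # Subtract the predicted late reverberation from the observation.
        x = y - y_bar @ g
    return x


# Toy bin: a 3-frame burst followed by an echo that recurs every 3 frames.
s = np.zeros(30, dtype=complex)
s[:3] = [1.0, 2.0, -1.0]
y = s.copy()
for t in range(3, 30):
    y[t] += 0.6 * y[t - 3]
x = wpe_single_bin(y)
```

In this noise-free toy case the echo is exactly predictable from frames at least `delay` steps in the past, so `x` should closely match the original burst `s`. Real reverberant speech needs more taps, a full STFT front-end, and per-bin processing across all frequencies.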

