Deep neural network-based generalized sidelobe canceller for dual-channel far-field speech recognition

Guanjun Li, Shan Liang, Shuai Nie, Wenju Liu, Zhanlei Yang

doi:10.1016/j.neunet.2021.04.017

Abstract

The traditional generalized sidelobe canceller (GSC) is a common speech enhancement front end for improving the noise robustness of automatic speech recognition (ASR) systems in far-field conditions. However, the traditional GSC is optimized with signal-level criteria, which do not guarantee optimal ASR performance. To address this issue, we propose a novel dual-channel deep neural network (DNN)-based GSC structure, called nnGSC, which is optimized with the objective of maximizing ASR performance. Our key idea is to make each module of the traditional GSC fully learnable and to optimize the GSC jointly with the acoustic model. We use the coefficients of the traditional GSC to initialize nnGSC, so that both traditional signal processing knowledge and large amounts of data guide the network learning. In addition, nnGSC can automatically track the target direction-of-arrival (DOA) frame by frame without additional localization algorithms. In the experiments, nnGSC achieves a relative character error rate (CER) improvement of 23.7% compared to the microphone observation, 13.5% compared to the oracle direction-based super-directive beamformer, 12.2% compared to the oracle direction-based traditional GSC, and 5.9% compared to the oracle mask-based minimum variance distortionless response (MVDR) beamformer. Moreover, the robustness of nnGSC against array geometry mismatches can be improved by training with multi-geometry data.
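
For readers unfamiliar with the traditional GSC that the abstract takes as its starting point, the following is a minimal sketch of a classical two-channel, frequency-domain GSC. It is not the authors' nnGSC and does not reproduce their configuration. The function names, the 5 cm microphone spacing, the single-tap NLMS canceller, and the step size are all assumptions chosen for illustration. The sketch shows the three modules the abstract refers to: a fixed beamformer steered at the target, a blocking branch that removes the target, and an adaptive canceller that subtracts the remaining noise.

```python
import numpy as np


def steering_vector(freqs_hz, doa_rad, mic_distance_m=0.05, speed_of_sound=343.0):
    """Far-field steering vectors for a two-microphone array, shape (F, 2).

    The spacing and speed of sound are illustrative defaults (assumptions).
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    # Inter-microphone delay for a plane wave arriving from doa_rad.
    delay_s = mic_distance_m * np.cos(doa_rad) / speed_of_sound
    return np.stack(
        [np.ones(freqs_hz.shape, dtype=complex),
         np.exp(-2j * np.pi * freqs_hz * delay_s)],
        axis=-1,
    )


def classical_gsc(X, d, mu=0.5, eps=1e-8):
    """Apply a two-channel GSC to an STFT X of shape (T, F, 2).

    d holds the target steering vectors, shape (F, 2). Returns shape (T, F).
    """
    num_frames, num_bins, _ = X.shape
    g = np.zeros(num_bins, dtype=complex)  # one adaptive weight per frequency bin
    Y = np.empty((num_frames, num_bins), dtype=complex)
    for t in range(num_frames):
        x0, x1 = X[t, :, 0], X[t, :, 1]
        # Fixed beamformer (delay-and-sum): w = d / 2, output w^H x.
        y_fbf = 0.5 * (np.conj(d[:, 0]) * x0 + np.conj(d[:, 1]) * x1)
        # Blocking branch: b^H x with b^H d = 0, so the target is removed.
        u = d[:, 1] * x0 - d[:, 0] * x1
        # Subtract the noise estimate, then update the weight with NLMS.
        Y[t] = y_fbf - np.conj(g) * u
        g = g + mu * u * np.conj(Y[t]) / (np.abs(u) ** 2 + eps)
    return Y
```

In this sketch the steering vector must be supplied from a known target direction, which is the "oracle direction" setting of the baselines mentioned in the abstract. According to the abstract, nnGSC instead makes each of these modules learnable and tracks the target direction itself.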

Journal: Neural Networks: the official journal of the International Neural Network Society
Publication Date: Apr 19, 2021
Citations: 22

Similar Papers

Developing children's ASR system under low-resource conditions using end-to-end architecture
Ankita ... S Shahnawazuddin
Digital signal processing | VOL. 146
08 Jan 2024

Combined speech enhancement and auditory modelling for robust distributed speech recognition
Ronan Flynn ... Edward Jones
Speech Communication | VOL. 50
20 May 2008

A new GSC based MVDR beamformer with CSLMS algorithm for adaptive weights optimization
Wei Shao ... Wei-Cheng Wang
01 Oct 2011

On time-frequency mask estimation for MVDR beamforming with application in robust speech recognition
Xiong Xiao ... Douglas L Jones
01 Mar 2017
