Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement

Hassan Taherian, Jorge Chang, Zhong-Qiu Wang, Deliang Wang

doi:10.1109/taslp.2020.2986896

Abstract

Deep neural network (DNN) embeddings for speaker recognition have recently attracted much attention. Compared to i-vectors, they are more robust to noise and room reverberation because DNNs leverage large-scale training. This article addresses the question of whether speech enhancement approaches are still useful when DNN embeddings are used for speaker recognition. We investigate single- and multi-channel speech enhancement for text-independent speaker verification based on x-vectors in conditions where strong diffuse noise and reverberation are both present. Single-channel (monaural) speech enhancement is based on complex spectral mapping and is applied to individual microphones. For multi-channel speech enhancement, we use a masking-based minimum variance distortionless response (MVDR) beamformer and its rank-1 approximation. We propose a novel method of deriving time-frequency masks from the estimated complex spectrogram. In addition, we investigate gammatone frequency cepstral coefficients (GFCCs) as robust speaker features. Systematic evaluations and comparisons on the NIST SRE 2010 retransmitted corpus show that both monaural and multi-channel speech enhancement significantly improve x-vector performance, and that our covariance matrix estimate is effective for the MVDR beamformer.
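
To make the masking-based MVDR step more concrete, the following is a minimal NumPy sketch of the widely used textbook formulation: mask-weighted spatial covariance matrices followed by reference-microphone MVDR weights. It is not the authors' implementation. The abstract does not say how the time-frequency mask is derived from the estimated complex spectrogram, so `ratio_mask` below is a purely illustrative assumption, and all function names, the array layout (channels, frequencies, frames), and the diagonal loading constant are hypothetical choices. The rank-1 approximation mentioned in the abstract is not covered.

```python
import numpy as np


def ratio_mask(s_hat, y_ref, eps=1e-8):
    """Illustrative mask (an assumption, not the paper's method).

    s_hat: estimated clean complex spectrogram, shape (F, T)
    y_ref: noisy complex spectrogram at the same microphone, shape (F, T)
    """
    speech_pow = np.abs(s_hat) ** 2
    noise_pow = np.abs(y_ref - s_hat) ** 2  # residual treated as noise
    return speech_pow / (speech_pow + noise_pow + eps)


def masked_covariance(Y, mask, eps=1e-8):
    """Mask-weighted spatial covariance per frequency.

    Y: multi-channel STFT, shape (C, F, T); mask: shape (F, T), values in [0, 1].
    Returns an array of shape (F, C, C).
    """
    weighted = Y * mask[None]
    cov = np.einsum("cft,dft->fcd", weighted, Y.conj())
    return cov / (mask.sum(axis=-1)[:, None, None] + eps)


def mvdr_weights(phi_s, phi_n, ref=0, diag_load=1e-6):
    """MVDR weights w = (phi_n^-1 phi_s) u_ref / trace(phi_n^-1 phi_s).

    phi_s, phi_n: speech and noise covariances, shape (F, C, C).
    Returns weights of shape (F, C).
    """
    C = phi_n.shape[-1]
    # Small diagonal loading (hypothetical value) keeps the solve well conditioned.
    load = diag_load * np.trace(phi_n, axis1=-2, axis2=-1).real / C
    phi_n = phi_n + load[:, None, None] * np.eye(C)
    num = np.linalg.solve(phi_n, phi_s)  # phi_n^-1 phi_s for every frequency
    tr = np.trace(num, axis1=-2, axis2=-1)[:, None]
    return num[..., ref] / (tr + 1e-12)


def apply_beamformer(w, Y):
    """Apply w^H to each time-frequency bin; returns shape (F, T)."""
    return np.einsum("fc,cft->ft", w.conj(), Y)
```

In this sketch the speech covariance would come from `masked_covariance(Y, mask)` and the noise covariance from `masked_covariance(Y, 1.0 - mask)`; the beamformed output could then be passed to a feature extractor for speaker embedding.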

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Jan 1, 2020
Citations: 86
