Speaker Adaptive Training using Deep Neural Networks

Tsubasa Ochiai, Shigeru Katagiri, Shigeki Matsuda, Xugang Lu, Chiori Hori

doi:10.1109/icassp.2014.6854826

Abstract

Among the many approaches to speaker adaptation, Speaker Adaptive Training (SAT) has been successfully applied to standard Hidden-Markov-Model (HMM) speech recognizers, whose states are associated with Gaussian Mixture Models (GMMs). Meanwhile, recent studies on Speaker-Independent (SI) recognizer development have reported that a new type of HMM speech recognizer, which replaces GMMs with Deep Neural Networks (DNNs), outperforms GMM-HMM recognizers. Combining these two lines of work, it is natural to seek further improvement to a preset DNN-HMM recognizer by employing SAT. In this paper, we propose a novel training scheme that applies SAT to an SI DNN-HMM recognizer. We implement the SAT scheme by allocating a Speaker-Dependent (SD) module to one of the intermediate layers of a seven-layer DNN, and evaluate its utility on TED Talks corpus data. Experimental results show that our speaker-adapted SAT-based DNN-HMM recognizer reduces the word error rate by 8.4% compared with a baseline SI DNN-HMM recognizer and, regardless of the SD module allocation, outperforms the conventional speaker adaptation scheme. The results also show that the inner layers of the DNN are more suitable for the SD module than the outer layers.
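The layer allocation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer width, the sigmoid activation, the random initialization, the position `SD_LAYER`, and the speaker names are all assumptions chosen for illustration, and training is omitted entirely. It only shows a seven-layer stack in which one intermediate layer is held per speaker while the remaining layers are shared.

```python
import math
import random

NUM_LAYERS = 7   # the abstract specifies a seven-layer DNN
SD_LAYER = 3     # hypothetical: index of the layer that holds the SD module


def make_layer(n_in, n_out, rng):
    """Return a (weights, biases) pair with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def apply_layer(layer, x):
    """Affine transform followed by a sigmoid nonlinearity."""
    weights, biases = layer
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
            for row, b in zip(weights, biases)]


def build_model(speakers, dim=4, seed=0):
    """Shared layers everywhere except SD_LAYER, which has one copy per speaker."""
    rng = random.Random(seed)
    shared = {i: make_layer(dim, dim, rng) for i in range(NUM_LAYERS) if i != SD_LAYER}
    sd_modules = {s: make_layer(dim, dim, rng) for s in speakers}
    return shared, sd_modules


def forward(model, speaker, x):
    """Run the stack, swapping in the given speaker's module at SD_LAYER."""
    shared, sd_modules = model
    for i in range(NUM_LAYERS):
        layer = sd_modules[speaker] if i == SD_LAYER else shared[i]
        x = apply_layer(layer, x)
    return x


# Hypothetical speakers and a toy 4-dimensional feature vector.
model = build_model(["spk_a", "spk_b"])
features = [0.5, -0.2, 0.1, 0.9]
out_a = forward(model, "spk_a", features)
out_b = forward(model, "spk_b", features)
```

Changing `SD_LAYER` moves the speaker-dependent module between inner and outer positions, which is the kind of allocation comparison the abstract reports.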

Similar Papers

Speaker adaptive training for deep neural networks embedding linear transformation networks
Tsubasa Ochiai ... Shigeki Matsuda
01 Apr 2015

Fast Adaptation of Deep Neural Network Based on Discriminant Codes for Speech Recognition
Shaofei Xue ... Qingfeng Liu
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 22
01 Dec 2014

Speaker Adaptation and Adaptive Training for Jointly Optimised Tandem Systems
Yu Wang ... Chao Zhang
02 Sep 2018

Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors
Yajie Miao ... Hao Zhang
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 23
01 Nov 2015
