Gaussian mixture models for adaptation of deep neural network acoustic models in automatic speech recognition systems

N.A Tomashenko, A Larcher, Ya Estève, Yu.Yu Khokhlov, Yu.N Matveev

doi:10.17586/2226-1494-2016-16-6-1063-1072

Abstract

Subject of Research. We study speaker adaptation of deep neural network (DNN) acoustic models in automatic speech recognition systems. The aim of speaker adaptation techniques is to improve the accuracy of a speech recognition system for a particular speaker. Method. A novel method for training and adaptation of DNN acoustic models has been developed. It is based on an auxiliary Gaussian mixture model (GMM) and GMM-derived (GMMD) features. The principal advantage of the proposed GMMD features is that a DNN can be adapted by adapting the auxiliary GMM. Any adaptation method for the auxiliary GMM can be used, so the approach provides a universal way of transferring adaptation algorithms developed for GMMs to DNN adaptation. Main Results. The effectiveness of the proposed approach was demonstrated using one of the most common adaptation algorithms for GMMs, maximum a posteriori (MAP) adaptation. Different ways of integrating the proposed approach into a state-of-the-art DNN architecture are proposed and explored. The choice of the type of auxiliary GMM is analyzed. Experimental results on the TED-LIUM corpus demonstrate that, in an unsupervised adaptation mode, the proposed adaptation technique can provide approximately an 11–18% relative word error rate (WER) reduction on different adaptation sets compared to the speaker-independent DNN system built on conventional features, and a 3–6% relative WER reduction compared to the SAT-DNN trained on fMLLR-adapted features.
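
To make the idea of adapting a DNN through its auxiliary GMM more concrete, here is a minimal sketch of MAP adaptation of GMM means followed by feature extraction from the adapted model. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how GMMD features are computed, so the sketch assumes they are per-component log-likelihoods of a diagonal-covariance GMM. The function names, the relevance factor `tau`, and the mean-only update are all illustrative choices.

```python
import numpy as np


def component_log_likelihoods(x, means, variances, weights):
    """Weighted per-component log-likelihoods log(w_k * N(x_t | mu_k, diag(var_k))).

    x: (T, D) frames; means, variances: (K, D); weights: (K,).
    Returns an array of shape (T, K).
    ASSUMPTION: these values stand in for "GMM-derived" features.
    """
    diff = x[:, None, :] - means[None, :, :]                            # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (K,)
    mahal = -0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)    # (T, K)
    return np.log(weights)[None, :] + log_norm[None, :] + mahal


def map_adapt_means(x, means, variances, weights, tau=10.0):
    """Standard MAP update of GMM means from adaptation frames x.

    mu_k_new = (tau * mu_k + sum_t gamma_tk * x_t) / (tau + sum_t gamma_tk),
    where gamma_tk is the posterior of component k for frame t.
    tau is a hypothetical relevance factor; variances and weights are kept fixed.
    """
    ll = component_log_likelihoods(x, means, variances, weights)
    ll -= ll.max(axis=1, keepdims=True)        # stabilize before exponentiating
    post = np.exp(ll)
    post /= post.sum(axis=1, keepdims=True)    # (T, K) component posteriors
    occupancy = post.sum(axis=0)               # (K,) soft frame counts
    first_order = post.T @ x                   # (K, D) posterior-weighted sums
    return (tau * means + first_order) / (tau + occupancy)[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy speaker-independent GMM with 4 components over 3-dimensional frames.
    means = rng.normal(size=(4, 3))
    variances = np.ones((4, 3))
    weights = np.full(4, 0.25)
    # Toy adaptation data for one speaker, shifted away from the model.
    speaker_frames = rng.normal(loc=0.5, size=(200, 3))

    adapted_means = map_adapt_means(speaker_frames, means, variances, weights)
    # Features from the adapted GMM would be the input to the (unchanged) DNN.
    features = component_log_likelihoods(speaker_frames, adapted_means, variances, weights)
    print(features.shape)
```

In this sketch the DNN itself is never touched: only the GMM means move toward the speaker's data, and the features computed from the adapted GMM change accordingly. With few adaptation frames the update stays close to the speaker-independent means, and with more frames it approaches the speaker's own statistics.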

Journal: Scientific and Technical Journal of Information Technologies, Mechanics and Optics
Publication Date: Nov 15, 2016
Citations: 1
License type: publisher-specific-oa

Similar Papers

Towards speaker adaptive training of deep neural network acoustic models
Yajie Miao ... Florian Metze
14 Sep 2014

Multi-task deep neural network acoustic models with model adaptation using discriminative speaker identity for whisper recognition
Jingjie Li ... Si Wei
01 Apr 2015

Low-rank bases for factorized hidden layer adaptation of DNN acoustic models
Lahiru Samarakoon ... Khe Chai Sim
01 Dec 2016

Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors
Yajie Miao ... Florian Metze
IEEE/ACM transactions on audio, speech, and language processing | VOL. 23
01 Nov 2015
