Speaker Adaptation and Adaptive Training for Jointly Optimised Tandem Systems

Yu Wang, Philip Woodland, Mark Gales, Chao Zhang

doi:10.21437/interspeech.2018-2432

Abstract

© 2018 International Speech Communication Association. All rights reserved. Speaker independent (SI) Tandem systems trained by joint optimisation of bottleneck (BN) deep neural networks (DNNs) and Gaussian mixture models (GMMs) have been found to produce similar word error rates (WERs) to Hybrid DNN systems. A key advantage of using GMMs is that existing speaker adaptation methods, such as maximum likelihood linear regression (MLLR), can be used to account for diverse speaker variations and improve system robustness. This paper investigates speaker adaptation and adaptive training (SAT) schemes for jointly optimised Tandem systems. Adaptation techniques investigated include constrained MLLR (CMLLR) transforms based on BN features for SAT, as well as MLLR and parameterised sigmoid functions for unsupervised test-time adaptation. Experiments using English multi-genre broadcast (MGB3) data show that CMLLR SAT yields a 4% relative WER reduction over jointly trained Tandem and Hybrid SI systems, and further reductions in WER are obtained by system combination.

Similar Papers

Linear Regression Based Acoustic Adaptation for the Subspace Gaussian Mixture Model
Sina Hamidi Ghalehjegh ... Richard C Rose
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 22
01 Sep 2014

Speaker adaptive joint training of Gaussian mixture models and bottleneck features
Zoltan Tuske ... Pavel Golik
01 Dec 2015

Rapid speaker adaptation in latent speaker space with non-negative matrix factorization
Xueru Zhang ... Hugo Van Hamme
Speech Communication | VOL. 55
16 May 2013

Investigation of unsupervised adaptation of DNN acoustic models with filter bank input
Takuya Yoshioka ... Mark J F Gales
01 May 2014
