Reliable Accent-Specific Unit Generation With Discriminative Dynamic Gaussian Mixture Selection for Multi-Accent Chinese Speech Recognition

Chao Zhang, Xuan Wang, Yi Liu, Yunqing Xia, Chin-Hui Lee

doi:10.1109/tasl.2013.2265087

Abstract

In this paper, we propose a discriminative dynamic Gaussian mixture selection (DGMS) strategy to generate reliable accent-specific units (ASUs) for multi-accent speech recognition. Time-aligned phone recognition is used to generate ASUs that model accent variations explicitly and accurately. DGMS reconstructs and adjusts a pre-trained set of hidden Markov model (HMM) state densities to build dynamic observation densities for each input speech frame. A discriminative minimum classification error criterion is adopted to optimize the sizes of the HMM state observation densities with a genetic algorithm (GA). To the authors' knowledge, this is the first proposal of discriminative training over discrete variables, which the discriminative optimization for DGMS accomplishes. We find that the proposed framework covers more multi-accent variations and thus reduces some of the performance loss in pruned beam search, without increasing the size of the original acoustic model set. Evaluation on three typical Chinese accents, Chuan, Yue and Wu, shows that our approach outperforms traditional acoustic model reconstruction techniques with syllable error rate reductions of 8.0%, 5.5% and 5.0%, respectively, while maintaining good performance on standard Putonghua speech.
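
To make the idea of a per-frame "dynamic observation density" concrete, here is a minimal sketch in Python. The abstract does not give the selection rule, so the sketch assumes a simple one: for each frame, keep the k best-scoring Gaussian components of a state's mixture and renormalize their weights. The names `log_gaussian_diag`, `dynamic_state_loglik`, and the parameter `k` are hypothetical, and the sketch omits the minimum classification error and genetic algorithm steps that the paper uses to choose the density sizes.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of frame x under one diagonal-covariance Gaussian."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def dynamic_state_loglik(x, weights, means, variances, k):
    """Frame log-likelihood from the k best-scoring components only.

    ASSUMPTION: top-k selection with weight renormalization is an
    illustrative rule, not necessarily the rule used in the paper.
    """
    # Weighted log score of every component for this frame.
    scores = [math.log(w) + log_gaussian_diag(x, m, v)
              for w, m, v in zip(weights, means, variances)]
    # Indices of the k highest-scoring components, best first.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Renormalize so the selected weights sum to one.
    log_norm = math.log(sum(weights[i] for i in top))
    selected = [scores[i] - log_norm for i in top]
    # Log-sum-exp over the selected components, stabilized by the peak.
    peak = max(selected)
    loglik = peak + math.log(sum(math.exp(s - peak) for s in selected))
    return loglik, top
```

Under this assumption, setting k to the full mixture size recovers the ordinary mixture log-likelihood, while smaller k yields a density that changes from frame to frame. The per-state value of k is a discrete quantity of the kind the abstract says is optimized discriminatively.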

Journal: IEEE Transactions on Audio, Speech, and Language Processing
Publication Date: Oct 1, 2013
Citations: 8

Similar Papers

Discriminative dynamic Gaussian mixture selection with enhanced robustness and performance for multi-accent speech recognition
Chao Zhang ... Chin-Hui Lee
01 Mar 2012

Compensation of SNR and noise type mismatch using an environmental sniffing based speech recognition solution
Yongjoo Chung ... John Hl Hansen
EURASIP Journal on Audio, Speech, and Music Processing | VOL. 2013
20 Jun 2013

A General Approximation-Optimization Approach to Large Margin Estimation of HMMs
Hui Jiang ... Xinwei Li
01 Jun 2007

Sentence-HMM state-based i-vector/PLDA modelling for improved performance in text dependent single utterance speaker verification
Osman Büyük
IET Signal Processing | VOL. 10
01 Oct 2016
