Acoustic modeling based on model structure annealing for speech recognition

Sayaka Shiota, Yoshihiko Nankaku, Heiga Zen, Akinobu Lee, Keiichi Tokuda, Kei Hashimoto

doi:10.21437/interspeech.2008-111

Abstract

This paper proposes an HMM training technique that uses multiple phonetic decision trees and evaluates it in speech recognition. When context-dependent models are used, decision-tree-based context clustering is applied to find a parameter tying structure. However, the clustering is usually based on statistics of HMM state sequences that are obtained from unreliable models trained without context clustering. To avoid this problem, we optimize the decision trees and the HMM state sequences simultaneously. In the proposed method, this is done by maximum likelihood (ML) estimation of a newly defined statistical model that includes multiple decision trees as hidden variables. By applying the deterministic annealing expectation maximization (DAEM) algorithm and using multiple decision trees in the early stage of model training, the state sequences are estimated reliably. In continuous phoneme recognition experiments, the proposed method improves recognition performance.
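
The annealing idea behind DAEM can be illustrated on a much simpler model than the one in the paper. The sketch below is only a toy under stated assumptions: a 1-D Gaussian mixture with unit variances held fixed, a hand-picked temperature schedule `betas`, and hypothetical function names (`tempered_posteriors`, `daem_fit`). It does not implement the HMM or the multiple-decision-tree model proposed here. It shows only the core mechanism: posteriors over hidden variables are raised to a power beta < 1 early in training, which flattens them, and beta is then increased toward 1, where the update coincides with ordinary EM.

```python
import math

def gaussian_logpdf(x, mean, var=1.0):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def tempered_posteriors(x, weights, means, beta):
    """Posterior over components, proportional to (w_k * N(x | m_k, 1)) ** beta."""
    logs = [beta * (math.log(w) + gaussian_logpdf(x, m))
            for w, m in zip(weights, means)]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logs]
    total = sum(exps)
    return [e / total for e in exps]

def daem_fit(data, init_means, betas=(0.1, 0.3, 0.6, 1.0), iters_per_beta=20):
    """Toy DAEM: run EM at each inverse temperature, from low beta up to 1.

    Assumption for illustration: variances are fixed at 1 and only the
    mixture weights and means are re-estimated.
    """
    k = len(init_means)
    means = list(init_means)
    weights = [1.0 / k] * k
    for beta in betas:
        for _ in range(iters_per_beta):
            # E-step with tempered (flattened) posteriors
            post = [tempered_posteriors(x, weights, means, beta) for x in data]
            # M-step: the usual weighted updates
            for j in range(k):
                occ = sum(p[j] for p in post)
                weights[j] = occ / len(data)
                means[j] = sum(p[j] * x for p, x in zip(post, data)) / occ
    return weights, means
```

For example, `daem_fit([-2.1, -1.9, -2.0, 2.0, 1.9, 2.1], [0.0, 0.1])` starts from two nearly identical means. At low beta the components stay merged near the global mean, and they separate toward the two clusters only as beta rises. This loosely mirrors the abstract's point that soft, less committed estimates are used early, while the models are still unreliable.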

