Speech recognition based on statistical models including multiple phonetic decision trees

Sayaka Shiota, Yoshihiko Nankaku, Keiichi Tokuda, Kei Hashimoto, Akinobu Lee, Heiga Zen

doi:10.1250/ast.32.236

Abstract

We propose a speech recognition technique using multiple model structures. When context-dependent models are used, decision-tree-based context clustering is applied to find an appropriate parameter-tying structure. However, context clustering is usually performed on the basis of unreliable statistics of hidden Markov model (HMM) state sequences, because estimating reliable state sequences requires an appropriate model structure, which cannot be obtained prior to context clustering. Context clustering and the estimation of state sequences therefore cannot, in essence, be performed independently. To overcome this problem, we propose a technique for optimizing state sequences based on an annealing process that uses multiple decision trees. In this technique, a new likelihood function is defined to handle multiple model structures, and the deterministic annealing expectation-maximization (DAEM) algorithm is used for training. In continuous phoneme recognition experiments, the proposed method, using only two decision trees, achieved a relative error reduction of about 11.1% over the conventional method.
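The annealing idea behind the deterministic annealing EM algorithm named in the abstract can be illustrated on a much smaller problem. The sketch below is not the authors' method: it does not involve HMMs, decision trees, or the paper's likelihood function over multiple model structures. It only assumes the standard deterministic annealing E-step, in which the component posteriors are raised to a power beta that is increased step by step toward 1, and applies it to a toy one-dimensional Gaussian mixture with a fixed, shared variance. The function name `daem_fit_means`, the beta schedule, and the data are all hypothetical choices made for illustration.

```python
import math


def daem_fit_means(data, init_means, betas=(0.05, 0.1, 0.2, 0.5, 1.0),
                   iters_per_beta=25, variance=1.0):
    """Toy deterministic annealing EM for a 1-D Gaussian mixture.

    Assumptions (not from the paper): the variance is fixed and shared by
    all components, and only the means and mixture weights are estimated.
    """
    means = list(init_means)
    k = len(means)
    weights = [1.0 / k] * k
    for beta in betas:  # beta is the inverse temperature, raised toward 1
        for _ in range(iters_per_beta):
            # E-step: tempered posteriors, proportional to p(x, j) ** beta.
            # The Gaussian normalizer is shared by all components and cancels.
            resp = []
            for x in data:
                scaled = [beta * (math.log(weights[j])
                                  - 0.5 * (x - means[j]) ** 2 / variance)
                          for j in range(k)]
                top = max(scaled)  # subtract the max for numerical stability
                unnorm = [math.exp(s - top) for s in scaled]
                total = sum(unnorm)
                resp.append([u / total for u in unnorm])
            # M-step: the usual weighted updates, using the tempered posteriors.
            for j in range(k):
                nj = max(sum(r[j] for r in resp), 1e-12)  # guard against 0
                means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
                weights[j] = nj / len(data)
    return weights, means


if __name__ == "__main__":
    # Two clusters around -5 and 5, with deliberately poor initial means.
    toy = [-5.2, -5.1, -5.0, -4.9, -4.8, 4.8, 4.9, 5.0, 5.1, 5.2]
    print(daem_fit_means(toy, [0.0, 0.5]))
```

At small beta the posteriors are nearly uniform, so the early updates depend only weakly on the initial parameters; as beta approaches 1 the procedure becomes ordinary EM. In this toy setting the estimated means should settle close to the two cluster centers despite the poor initialization.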

Journal: Acoustical Science and Technology
Publication Date: Jan 1, 2011
Citations: 1
License type: free
