Factored Maximum Penalized Likelihood Kernel Regression for HMM-Based Style-Adaptive Speech Synthesis

June Sig Sung, Nam Soo Kim, Doo Hwa Hong

doi:10.1109/jstsp.2014.2305131

Abstract

Speech synthesized from the same text should sound different depending on the speaking style. Current speech synthesis techniques based on the hidden Markov model (HMM) usually focus on a fixed speaking style, and changing the speaking style requires multiple parameter sets, each trained on a different speaking style. A promising alternative is to adapt the base model to the intended speaking style. In our previous work, we proposed factored maximum likelihood linear regression (FMLLR) adaptation, in which each MLLR parameter is defined as a function of a control vector, and we presented a method to train the FMLLR parameters based on the general framework of the expectation-maximization (EM) algorithm. In this paper, we introduce a novel technique called factored maximum penalized likelihood kernel regression (FMLKR) for HMM-based style-adaptive speech synthesis. FMLKR builds on the FMLLR framework and uses a kernel method to perform nonlinear regression between the mean vectors of the base model and the corresponding mean vectors of the adaptation data. In a series of experiments on the artificial generation of singing voice and expressive speech, we evaluate the performance of the FMLLR and FMLKR techniques with various matrix structures and compare them with other approaches to parameter adaptation in HMM-based speech synthesis.
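The idea of making each MLLR parameter a function of a control vector can be illustrated with a minimal sketch. The abstract does not state the functional form, so the code below assumes the simplest one, a linear combination of basis transforms weighted by the control vector; the names `factored_transform`, `adapt_mean`, and `BASES` are hypothetical and are not taken from the paper. It shows only the mean-adaptation step, not EM training or the kernel extension used in FMLKR.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, x: Vector) -> Vector:
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def factored_transform(bases: List[Matrix], control: Vector) -> Matrix:
    """ASSUMPTION: W(v) = sum_k v[k] * bases[k], a linear dependence on v."""
    rows, cols = len(bases[0]), len(bases[0][0])
    return [
        [sum(v * b[i][j] for v, b in zip(control, bases)) for j in range(cols)]
        for i in range(rows)
    ]


def adapt_mean(mean: Vector, bases: List[Matrix], control: Vector) -> Vector:
    """MLLR-style mean adaptation: mu_hat = W(v) @ [mu; 1]."""
    extended = mean + [1.0]  # append 1 so the last column of W acts as a bias
    return matvec(factored_transform(bases, control), extended)


# Hypothetical basis transforms for a 2-dimensional mean vector.
BASES: List[Matrix] = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],   # identity, no bias (base style)
    [[0.0, 0.0, 1.0], [0.0, 0.0, -0.5]],  # pure bias shift (style direction)
]

neutral = adapt_mean([1.0, 2.0], BASES, [1.0, 0.0])  # style weight 0
styled = adapt_mean([1.0, 2.0], BASES, [1.0, 2.0])   # style weight 2
```

With a style weight of zero the base mean is returned unchanged, and increasing the weight moves the mean along the style direction, which is the sense in which a control vector steers the adaptation in this sketch.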

Journal: IEEE Journal of Selected Topics in Signal Processing
Publication Date: Apr 1, 2014
Citations: 19

Similar Papers

HMM-based Finnish text-to-speech system utilizing glottal inverse filtering
Tuomo Raitio ... Martti Vainio
22 Sep 2008

Multi-speaker modeling with shared prior distributions and model structures for Bayesian speech synthesis
Kei Hashimoto ... Yoshihiko Nankaku
27 Aug 2011

An HMM-Based Approach to Flexible Speech Synthesis
Keiichi Tokuda
01 Jan 2006

HMM-Based Speech Synthesis System for Marine IT Convergence Technology Using Factored Maximum Likelihood Linear Regression Adaptation
June Sig Sung ... Nam Soo Kim
The Journal of Korea Information and Communications Society | VOL. 38C
28 Feb 2013
