Speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS

Po-Chun Wang, I-Bin Liao, Yih-Ru Wang, Sin-Horng Chen, Chen-Yu Chiang

doi:10.1109/iscslp.2014.6936616

Abstract

This paper proposes a speaker adaptation method that adapts an existing speaking rate-dependent hierarchical prosodic model (SR-HPM) of an SR-controlled Mandarin TTS system to a new speaker's data in order to realize a new voice. Two main problems are addressed: data sparseness, because the few adaptation utterances cover only a small range of normal speaking rates, and the absence of adaptation data in both the fast and slow speaking-rate ranges. The proposed method follows the idea of SR-HPM training: it first normalizes the prosodic-acoustic features of the new speaker's speech data, then trains an HPM with the prosody labeling and modeling algorithm, and finally refines the HPM into an SR-dependent model. MAP adaptation with model parameter extrapolation is applied to cope with these two problems. Experimental results on a male speaker's adaptation data confirm that the resulting adaptive SR-HPM has reasonable parameters covering a wide range of speaking rates, and hence can be used in the TTS system to generate prosodic-acoustic features for synthesizing the new speaker's voice at any given SR.
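
The abstract does not give the adaptation formulas, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a single Gaussian mean per parameter, uses the textbook MAP update of a mean under a conjugate prior, and substitutes a constant-offset shift as a stand-in for the paper's model parameter extrapolation. The names `map_adapt_mean`, `extrapolate_offset`, and `tau`, and all numeric values, are hypothetical.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """Textbook MAP estimate of a Gaussian mean.

    prior_mean: mean from the existing (source-speaker) model.
    samples:    adaptation observations from the new speaker.
    tau:        prior weight (hypothetical value); larger tau keeps the
                estimate closer to prior_mean when data are sparse.
    """
    n = len(samples)
    if n == 0:
        # No adaptation data: fall back to the prior.
        return prior_mean
    return (tau * prior_mean + sum(samples)) / (tau + n)


def extrapolate_offset(prior_mean_by_sr, adapted_mean_normal, normal_sr):
    """Stand-in extrapolation (an assumption, not the paper's scheme).

    The speaker offset observed at the normal speaking rate is applied
    unchanged to the prior means at every other speaking rate.
    """
    offset = adapted_mean_normal - prior_mean_by_sr[normal_sr]
    return {sr: mean + offset for sr, mean in prior_mean_by_sr.items()}


# Hypothetical prior means of one prosodic parameter at three speaking rates.
prior = {"slow": 4.0, "normal": 5.0, "fast": 6.0}

# Adaptation data exist only at the normal speaking rate.
adapted_normal = map_adapt_mean(prior["normal"], [6.0, 6.0], tau=2.0)
adapted_all = extrapolate_offset(prior, adapted_normal, "normal")
print(adapted_normal, adapted_all)
```

With few samples the estimate stays near the prior mean, which is how a MAP update copes with sparse data; the offset shift only illustrates how parameters at unobserved fast and slow rates might be filled in.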

Similar Papers

Speaker and style adaptation using average voice model for style control in HMM-based speech synthesis
Makoto Tachibana ... Takao Kobayashi
01 Mar 2008

Speaker Adaptation of SR-HPM for Speaking Rate-Controlled Mandarin TTS
I-Bin Liao ... Sin-Horng Chen
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 24
01 Nov 2016

Structural maximum a posteriori speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS
I-Bin Liao ... Chen-Yu Chiang
01 Mar 2016

An investigation of multi-speaker training for WaveNet vocoder
Tomoki Hayashi ... Kazuhiro Kobayashi
01 Dec 2017
