Unsupervised Intralingual and Cross-Lingual Speaker Adaptation for HMM-Based Speech Synthesis Using Two-Pass Decision Tree Construction

Matthew Gibson, William Byrne

doi:10.1109/tasl.2010.2066968

Abstract

Hidden Markov model (HMM)-based speech synthesis systems possess several advantages over concatenative synthesis systems. One such advantage is the relative ease with which HMM-based systems are adapted to speakers not present in the training dataset. Speaker adaptation methods used in the field of HMM-based automatic speech recognition (ASR) are adopted for this task. In the case of unsupervised speaker adaptation, previous work has used a supplementary set of acoustic models to estimate the transcription of the adaptation data. This paper first presents an approach to the unsupervised speaker adaptation task for HMM-based speech synthesis models which avoids the need for such supplementary acoustic models. This is achieved by defining a mapping between HMM-based synthesis models and ASR-style models, via a two-pass decision tree construction process. Second, it is shown that this mapping also enables unsupervised adaptation of HMM-based speech synthesis models without the need to perform linguistic analysis of the estimated transcription of the adaptation data. Third, this paper demonstrates how this technique lends itself to the task of unsupervised cross-lingual adaptation of HMM-based speech synthesis models, and explains the advantages of such an approach. Finally, listener evaluations reveal that the proposed unsupervised adaptation methods deliver performance approaching that of supervised adaptation.
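The abstract does not spell out how the two-pass construction works. One plausible reading, offered here as an illustrative sketch and not as the authors' implementation, is that a first pass grows a decision tree using only phonetic (triphone-style) questions, giving ASR-style clusters, and a second pass keeps splitting each first-pass leaf using the full set of context questions. Every synthesis-model leaf then descends from exactly one ASR-style leaf, which defines the mapping. The feature names, the toy data, the squared-error splitting criterion (a stand-in for a likelihood gain), and the `min_gain` threshold below are all assumptions made for illustration.

```python
# Toy sketch of two-pass decision tree clustering (assumed reading of the method).
# Each sample is (context, observation); contexts and values are invented.
PHONETIC = ("left", "center", "right")   # questions allowed in pass 1
PROSODIC = ("stress", "phrase_pos")      # extra questions allowed in pass 2

DATA = [
    ({"left": "s", "center": "a", "right": "t", "stress": 1, "phrase_pos": 0}, 5.0),
    ({"left": "s", "center": "a", "right": "t", "stress": 1, "phrase_pos": 1}, 5.2),
    ({"left": "s", "center": "a", "right": "t", "stress": 0, "phrase_pos": 0}, 3.0),
    ({"left": "s", "center": "a", "right": "t", "stress": 0, "phrase_pos": 1}, 3.2),
    ({"left": "s", "center": "b", "right": "t", "stress": 1, "phrase_pos": 0}, 0.0),
    ({"left": "s", "center": "b", "right": "t", "stress": 0, "phrase_pos": 1}, 0.1),
]

def sse(values):
    """Sum of squared deviations from the mean (proxy for negative log-likelihood)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def questions(samples, features):
    """Yes/no questions of the form 'is feature == value?' seen in the samples."""
    return sorted({(f, ctx[f]) for ctx, _ in samples for f in features})

def grow(samples, features, min_gain):
    """Greedily split samples; return the resulting leaves as lists of samples."""
    base = sse([y for _, y in samples])
    best = None
    for f, v in questions(samples, features):
        yes = [s for s in samples if s[0][f] == v]
        no = [s for s in samples if s[0][f] != v]
        if not yes or not no:
            continue  # question does not actually divide this node
        gain = base - (sse([y for _, y in yes]) + sse([y for _, y in no]))
        if best is None or gain > best[0]:
            best = (gain, yes, no)
    if best is None or best[0] < min_gain:
        return [samples]  # stop: this node becomes a leaf
    return grow(best[1], features, min_gain) + grow(best[2], features, min_gain)

def two_pass(samples, min_gain=0.5):
    """Pass 1: phonetic questions only. Pass 2: refine each leaf with all questions."""
    asr_leaves = grow(samples, PHONETIC, min_gain)
    synth_leaves, synth_to_asr = [], {}
    for asr_id, leaf in enumerate(asr_leaves):
        for sub_leaf in grow(leaf, PHONETIC + PROSODIC, min_gain):
            synth_to_asr[len(synth_leaves)] = asr_id
            synth_leaves.append(sub_leaf)
    return asr_leaves, synth_leaves, synth_to_asr

if __name__ == "__main__":
    asr, synth, mapping = two_pass(DATA)
    print(len(asr), "ASR-style leaves;", len(synth), "synthesis leaves")
    print("synthesis leaf -> ASR-style leaf:", mapping)
```

Under this reading, a transcription estimated with the coarser ASR-style clusters identifies a set of synthesis-model leaves through the mapping, which is consistent with the abstract's claim that no supplementary acoustic models are needed. How the paper actually uses the mapping during adaptation is not described in the abstract.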

Journal: IEEE Transactions on Audio, Speech, and Language Processing
Publication Date: May 1, 2011
Citations: 35
License type: mit

Similar Papers

Two-pass decision tree construction for unsupervised adaptation of HMM-based synthesis models
Matthew Gibson
06 Sep 2009

Unsupervised speaker adaptation for robust speech recognition in real environments
Shingo Yamade ... Hiroshi Saruwatari
01 Jan 2004

Some Aspects of ASR Transcription Based Unsupervised Speaker Adaptation for HMM Speech Synthesis
Bálint Tóth ... Géza Németh
01 Jan 2009

Modeling of Speech Parameter Sequence Considering Global Variance for HMM-Based Speech Synthesis
Tomoki Toda
19 Apr 2011
