Incorporating a mixed excitation model and postfilter into HMM-based text-to-speech synthesis

Takayoshi Yoshimura, Keiichi Tokuda, Tadashi Kitamura, Takashi Masuko, Takao Kobayashi

doi:10.1002/scj.20354

Abstract

In this paper we introduce a mixed excitation model into an HMM-based speech synthesis system in order to improve the quality of synthesized speech. In previous work we proposed a text-to-speech synthesis system that generates speech parameters from HMMs modeling mel-frequency cepstral coefficients, fundamental frequency, and duration. That system used a simple excitation source model to drive the synthesis filter (an MLSA filter), switching between a pulse train for voiced intervals and white noise for unvoiced intervals. Such an excitation model cannot synthesize sounds that contain both a periodic and an aperiodic component, such as voiced fricatives, and this is one cause of poor synthesized speech quality. Therefore, in this paper we incorporate a mixed excitation model that combines a pulse train with white noise, based on the narrowband vocoding method MELP, with a view to realizing high-quality speech synthesis. Since this excitation model can be applied to wideband as well as narrowband vocoding, we anticipate that it will prove effective for speech synthesis. In addition, we introduce a postfilter, a technique widely used in vocoding, to further improve the quality of the synthesized speech. The results of a subjective evaluation show the effectiveness of the mixed excitation model and postfilter in this system. © 2005 Wiley Periodicals, Inc. Syst Comp Jpn, 36(12): 43–50, 2005; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.20354
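The contrast between the two excitation models can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the band edges, the per-band voicing strengths, and the FFT-domain band masking are all assumptions made for illustration. A MELP-style coder would normally use bandpass filters and further refinements that are omitted here.

```python
import numpy as np

def pulse_train(n_samples, f0, fs):
    """Impulse train with a period of round(fs / f0) samples."""
    period = int(round(fs / f0))
    x = np.zeros(n_samples)
    x[::period] = np.sqrt(period)  # scale so the average power is about 1
    return x

def simple_excitation(n_samples, voiced, f0, fs, rng):
    """Baseline model: pulse train if voiced, white noise otherwise."""
    if voiced:
        return pulse_train(n_samples, f0, fs)
    return rng.standard_normal(n_samples)

def mixed_excitation(n_samples, f0, fs, band_edges_hz, voicing, rng):
    """Mix pulse and noise per frequency band (illustrative sketch).

    voicing[i] in [0, 1] is a hypothetical voicing strength for the band
    between band_edges_hz[i] and band_edges_hz[i + 1]: 1 means pulse only,
    0 means noise only. Bands are separated with FFT masks (an assumption).
    """
    pulse_spec = np.fft.rfft(pulse_train(n_samples, f0, fs))
    noise_spec = np.fft.rfft(rng.standard_normal(n_samples))
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    mixed = np.zeros_like(pulse_spec)
    last = len(voicing) - 1
    for i, v in enumerate(voicing):
        lo, hi = band_edges_hz[i], band_edges_hz[i + 1]
        # The last band includes its upper edge so the Nyquist bin is covered.
        upper = freqs <= hi if i == last else freqs < hi
        band = (freqs >= lo) & upper
        mixed[band] = v * pulse_spec[band] + (1.0 - v) * noise_spec[band]
    return np.fft.irfft(mixed, n=n_samples)

if __name__ == "__main__":
    fs = 16000
    edges = [0, 1000, 2000, 4000, 8000]   # hypothetical band edges in Hz
    strengths = [1.0, 0.8, 0.4, 0.1]      # periodic at low, noisy at high bands
    rng = np.random.default_rng(0)
    e = mixed_excitation(1600, 100.0, fs, edges, strengths, rng)
    print(e.shape)
```

With all voicing strengths set to 1 or all set to 0, the sketch reduces to the pulse-only or noise-only excitation of the simple model. Intermediate values give a signal with both periodic and aperiodic energy, which is the property the abstract identifies as necessary for sounds such as voiced fricatives.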

Journal: Systems and Computers in Japan
Publication Date: Jan 1, 2005
Citations: 21

Similar Papers

Improving Arabic HMM based speech synthesis quality
Ossama Abdel-Hamid ... Sherif Mahdy Abdou
17 Sep 2006

Deterministic annealing EM algorithm for developing TTS system in Gujarati
Nirmesh J Shah ... Maulik C Madhavi
01 Sep 2014

An HMM-based speech synthesis system applied to English
01 Jan 2002

Incorporating a mixed excitation model and postfilter into HMM-based text-to-speech synthesis
Systems and Computers in Japan
01 Nov 2005
