An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005

Heiga Zen, Tomoki Toda

doi:10.21437/interspeech.2005-76


Publication Date: Sep 4, 2005

Citations: 72

#HMM-based Speech Synthesis System #Real Time Ratio

Abstract

This paper describes the hidden Markov model (HMM) based speech synthesis system developed at Nagoya Institute of Technology (Nitech-HTS) for Blizzard Challenge 2005, a competition among text-to-speech synthesis systems built from the same speech databases. We give an overview of the basic HMM-based speech synthesis system and then describe the recent developments incorporated into the latest version: STRAIGHT-based vocoding, hidden semi-Markov model (HSMM) based acoustic modeling, and parameter generation considering global variance. The constructed voices synthesize speech at around 0.3 xRT (real-time ratio), and their footprints are less than 2 MB. The listening test results show that our systems performed much better than we expected.

Full Text