Abstract

Problem statement: The conventional HMM-based speech synthesis system for Thai provides no control over the fundamental frequency (F0) in the synthesis stage. The tone correctness of the synthesized speech is unacceptable because the training data are imbalanced across the tones. Approach: This study proposes a mathematical model to control the F0 contour of the synthesized speech. The control corrects only the distorted segments of the F0 contour, which occur within some syllables because training data are lacking for some tones. Results: An experiment compares the F0 contours of speech synthesized with and without tone-type questions; in addition, the size of the Thai speech corpus is varied to investigate the quality of the synthesized speech. The mathematical model is then applied to control the F0 contour. The experimental results clearly show that the proposed control corrects the F0 contour. Conclusion: A control of the F0 contour has been proposed. It noticeably improves the tone correctness of the synthesized speech.

Highlights

  • In the development of Thai speech synthesis, a TTS synthesis system based on unit selection was initially developed

  • A TTS synthesis system based on unit selection with the TD-PSOLA technique was developed by the National Electronics and Computer Technology Center (NECTEC) in 2003 (Hansakunbuntheung et al., 2005)

  • Since Thai is a tonal language, this study implements Thai speech synthesis based on HMMs, which can synthesize speech with various voice characteristics and speaking styles, and proposes an additional control of the fundamental frequency contour of the synthetic speech

Summary

Introduction

Speech synthesis is an important technology for realizing natural human-computer interaction. In the development of Thai speech synthesis, a TTS synthesis system based on unit selection was initially developed. An HMM-based TTS system, in which each speech synthesis unit is modeled by an HMM, has been proposed in recent years (Masuko et al., 1996; Yoshimura et al., 1999).
