Modeling of Fundamental Frequency Contour of Thai Expressive Speech using Fujisaki's Model and Structural Model

Chomphan Chomphan

doi:10.3844/jcssp.2011.1310.1317

Abstract

Problem statement: In spontaneous speech communication, prosody is an important factor that must be taken into account, since it affects not only the naturalness but also the intelligibility of speech. A number of systems for synthesizing Thai expressive speech have been developed over the years. However, expressive speech with various speaking styles has not yet been achieved. To generate expressive speech, the fundamental frequency (F0) contours must be modeled accurately to preserve the quality of speech prosody. Approach: This study presents a comparison of two successful F0 models. One approach is based on the Fujisaki

Highlights

Fujisaki's model has been applied within a speaker-independent system as extended modules. It has been used to model Thai expressive speech in sad, happy, and angry styles (Chomphan and Kobayashi, 2008; 2009).
Another study used a structural model based on the assumption that the behavior of vocal-fold elongation during vibration can be approximated by that of a simple forced vibrating system (Ni and Hirose, 2006; Chomphan and Kobayashi, 2009).
The RMS error was calculated to evaluate the modeling performance of both speech models for all speech styles, including angry style, sad style, enjoyable style and
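
The RMS error evaluation mentioned above can be sketched as follows. This is a minimal illustration, not the author's evaluation code: the function name `rms_error`, the convention that unvoiced frames are marked with 0, and the choice to compare values directly (rather than in the log domain) are all assumptions made here.

```python
import math

def rms_error(f0_ref, f0_model):
    """RMS error between two frame-aligned F0 contours.

    Assumption: unvoiced frames are marked with 0 and are skipped,
    so only frames voiced in both contours contribute.
    """
    pairs = [(r, m) for r, m in zip(f0_ref, f0_model) if r > 0 and m > 0]
    if not pairs:
        raise ValueError("no frames are voiced in both contours")
    return math.sqrt(sum((r - m) ** 2 for r, m in pairs) / len(pairs))
```

A lower value indicates that the modeled contour tracks the extracted contour more closely; computing it per speaking style would allow the two models to be compared style by style.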

Summary

Introduction

Fujisaki's model has been applied within a speaker-independent system as extended modules. It has been used to model Thai expressive speech in sad, happy, and angry styles (Chomphan and Kobayashi, 2008; 2009). To find the optimal representative parameters, the mean squared error in the ln F0(t) domain is minimized through a hill-climbing search in the space of model parameters (Seresangtakul and Takara, 2003). To use this model, the parameters are extracted from the speech database, utterance by utterance.
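
The parameter search described above can be illustrated with a small sketch. It uses the standard formulation of Fujisaki's model, in which ln F0(t) is the sum of a baseline value, phrase components, and accent (tone) components. Several things here are assumptions rather than details from the source: the constants `ALPHA`, `BETA`, and `GAMMA` are commonly quoted defaults, the contour is restricted to one phrase command and one accent command, and `hill_climb` is a simple coordinate-wise variant that may differ from the search used by Seresangtakul and Takara (2003).

```python
import math

# Commonly quoted defaults for the model constants (assumed, not from the source).
ALPHA, BETA, GAMMA = 3.0, 20.0, 0.9

def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0

def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def ln_f0(t, params):
    """ln F0 at time t for one phrase command and one accent command.

    params = [ln_fb, Ap, T0, Aa, T1, T2] (hypothetical layout):
    baseline, phrase magnitude and onset, accent amplitude, onset, offset.
    """
    ln_fb, ap, t0, aa, t1, t2 = params
    return (ln_fb
            + ap * phrase_response(t - t0)
            + aa * (accent_response(t - t1) - accent_response(t - t2)))

def mse(params, times, target):
    """Mean squared error between model and target in the ln F0 domain."""
    return sum((ln_f0(t, params) - y) ** 2
               for t, y in zip(times, target)) / len(times)

def hill_climb(params, times, target, step=0.1, min_step=1e-4, max_iters=500):
    """Coordinate-wise hill climbing on the MSE.

    Accepts any single-parameter move that lowers the error and halves
    the step size when no move helps.
    """
    params = list(params)
    best = mse(params, times, target)
    for _ in range(max_iters):
        if step < min_step:
            break
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = params[:]
                trial[i] += delta
                err = mse(trial, times, target)
                if err < best:
                    params, best, improved = trial, err, True
        if not improved:
            step /= 2.0
    return params, best
```

In use, `times` and `target` would hold the frame times and the ln F0 values extracted from one utterance, and `hill_climb` would be run once per utterance from an initial parameter guess. Because hill climbing only finds a local minimum, the quality of the fit depends on that initial guess.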

Journal: Journal of Computer Science
Publication Date: Aug 1, 2011
Citations: 9
License type: cc-by
