Abstract

The generation process model of fundamental frequency contours is well suited to representing the global features of prosody. It is a command-response model whose commands relate clearly to the linguistic and para-/non-linguistic information conveyed by an utterance. Handling fundamental frequency contours within the framework of this model therefore makes more flexible prosody control possible in speech synthesis. The model can also be used to address problems in HMM-based speech synthesis that arise from the frame-by-frame treatment of fundamental frequencies. Two approaches are possible: one applied before training and one applied after generation. The former suppresses unnatural fundamental frequency movements in the speech used for HMM training; the latter reshapes the fundamental frequency contours generated by HMM-based speech synthesis. A prosody conversion method is also developed, which operates on the differences in model commands between the original and target styles. This method enables flexible control of fundamental frequency contours in speech synthesis.
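To make the command-response idea concrete, the following is a minimal sketch, assuming the commonly cited formulation of the generation process model (often called the Fujisaki model), in which the logarithm of the fundamental frequency is a baseline plus the responses to impulse-like phrase commands and stepwise accent commands. The abstract does not give equations, so the functional forms, the constants `alpha`, `beta`, and `gamma` (typical literature values), and all function names here are illustrative assumptions rather than the author's implementation.

```python
import numpy as np


def phrase_response(t, alpha=3.0):
    """Response to a unit phrase command (impulse); zero before the command.

    alpha (1/s) is an assumed, typical value.
    """
    tp = np.maximum(t, 0.0)  # clamp so the exponential never overflows
    return np.where(t >= 0, alpha**2 * tp * np.exp(-alpha * tp), 0.0)


def accent_response(t, beta=20.0, gamma=0.9):
    """Response to a unit accent command onset (step), capped at gamma.

    beta (1/s) and gamma are assumed, typical values.
    """
    tp = np.maximum(t, 0.0)
    step = np.minimum(1.0 - (1.0 + beta * tp) * np.exp(-beta * tp), gamma)
    return np.where(t >= 0, step, 0.0)


def f0_contour(t, fb, phrase_cmds, accent_cmds):
    """Synthesize an F0 contour (Hz) from commands.

    t           : array of time points in seconds
    fb          : baseline frequency in Hz
    phrase_cmds : list of (onset_time, magnitude)
    accent_cmds : list of (onset_time, offset_time, amplitude)
    """
    log_f0 = np.full_like(t, np.log(fb), dtype=float)
    for onset, magnitude in phrase_cmds:
        log_f0 += magnitude * phrase_response(t - onset)
    for onset, offset, amplitude in accent_cmds:
        log_f0 += amplitude * (
            accent_response(t - onset) - accent_response(t - offset)
        )
    return np.exp(log_f0)


# Hypothetical example: one phrase command and one accent command.
t = np.linspace(0.0, 2.0, 201)
f0 = f0_contour(t, 100.0, [(0.0, 0.5)], [(0.3, 0.7, 0.4)])
```

Because the contour is fully determined by a handful of commands, changing a command's magnitude or timing reshapes the whole contour smoothly, which is the kind of global control that a frame-by-frame representation lacks.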
