Abstract

The prosodic model is an important component of a TTS system, and respiratory rhythm is an important factor affecting prosodic features. Based on the speech characteristics of Tibetan, this paper studies the correspondence between respiratory signals and Tibetan prosodic features and determines which speech and respiratory-signal parameters affect the prosodic feature parameters. Drawing on research experience with Chinese prosodic models, the paper establishes two Tibetan prosodic models using RBF neural networks: a speech prosodic model and a prosodic model of speech and respiratory signals, thereby introducing physiological signals into prosodic modeling. Both models are trained on a news corpus and their outputs are compared; the results show that the prosodic model of speech and respiratory signals generates fundamental frequency and duration parameters that are closer to natural speech. Listening and phonetic identification tests show that its synthesized speech achieves a MOS of 3.37, indicating high naturalness.
