A study on Tibetan prosodic model of speech and respiratory signals

Qi Chen, Chen Chen, Jing Shi, Hongzhi Yu

doi:10.1109/icinfa.2010.5512460

Abstract

The prosodic model is an important component of a text-to-speech (TTS) system, and respiratory rhythm is an important factor affecting prosodic features. Based on the speech characteristics of Tibetan, this paper studies the correspondence between respiratory signals and Tibetan prosodic features and determines which parameters of the speech and respiratory signals affect the prosodic feature parameters. Drawing on research experience with Chinese prosodic models, the paper establishes two Tibetan prosodic models with an RBF neural network: a speech prosodic model and a prosodic model of speech and respiratory signals, thereby introducing physiological signals into prosodic modeling. A news corpus is used to train both models, and their outputs are compared in a test; the results show that the prosodic model of speech and respiratory signals generates fundamental frequency and duration parameters that are closer to natural speech. Listening and phonetic identification tests show that its synthesized speech achieves a MOS of 3.37, with high naturalness.

Full Text