Abstract

This paper presents a method for HMM-based Tibetan speech synthesis built on a Mandarin speech synthesis framework. The Mandarin context-dependent label format is adopted to label Tibetan sentences, and the Mandarin question set is extended for Tibetan by adding language-specific questions. The Mandarin framework is used to train an average mixed-lingual model from a large multi-speaker Mandarin corpus and a small single-speaker Tibetan corpus using speaker adaptive training. A speaker adaptation transformation is then applied to the average mixed-lingual model to obtain a speaker-adapted Tibetan model. Experimental results show that this method outperforms a speaker-dependent Tibetan model when only a small number of Tibetan training utterances is available. As the number of Tibetan training utterances increases, the performance of the two methods tends to converge.

