Abstract

This paper presents a study of prosody modeling for speech synthesis. Any text-to-speech system comprises two phases: text analysis and speech synthesis. The task of text analysis is to identify the words, and the task of speech synthesis is to generate the speech. Several models are available for this purpose, such as text-as-language models, grapheme-to-phoneme models, full linguistic analysis models, and complete prosody generation models. In a complete prosody generation model, quantities such as phrasing and stress are determined in order to generate a natural-sounding synthetic voice. Generating such speech requires an explicit prosodic model, which also makes the speech more understandable. Much research has been done in this area, but better solutions are still required. In this paper, the strengths and weaknesses of different approaches to prosody modeling are discussed.
