Abstract

This paper presents the design and development of syllable-specific unit selection cost functions for improving the quality of text-to-speech synthesis. Two unit selection cost functions, a concatenation cost and a target cost, are proposed for syllable-based synthesis. Concatenation costs are defined according to the type of segments present at the syllable joins, and the proposed costs significantly reduce perceptual discontinuity at those joins. A three-stage target cost formulation is proposed for selecting appropriate units from the database. Subjective evaluation shows an improvement in speech quality at each stage.
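
For orientation, the sketch below shows how a target cost and a concatenation cost are typically combined in a generic unit selection search. It is not the formulation proposed in this paper: the function name `select_units`, the two cost callbacks, and the use of a plain Viterbi search are all assumptions made for illustration. The paper's join-type-dependent concatenation costs and three-stage target cost would be supplied as the two callbacks.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")  # target specification for one syllable (hypothetical type)
U = TypeVar("U")  # candidate unit from the speech database (hypothetical type)


def select_units(
    targets: Sequence[T],
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[T, U], float],
    concat_cost: Callable[[U, U], float],
) -> Tuple[List[U], float]:
    """Return the lowest-cost unit sequence and its total cost.

    Total cost = sum of target costs + sum of concatenation costs
    between adjacent units, minimised with a Viterbi search.
    """
    if not targets or len(targets) != len(candidates):
        raise ValueError("need exactly one candidate list per target")
    if any(len(c) == 0 for c in candidates):
        raise ValueError("every target needs at least one candidate")

    # best[i][j]: lowest cost of any path that ends in candidates[i][j]
    best = [[target_cost(targets[0], u) for u in candidates[0]]]
    # back[i][j]: index of the predecessor chosen for candidates[i][j]
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(targets)):
        row_cost, row_back = [], []
        for u in candidates[i]:
            # Cheapest predecessor, including the cost of joining it to u.
            k, c = min(
                (
                    (k, best[i - 1][k] + concat_cost(p, u))
                    for k, p in enumerate(candidates[i - 1])
                ),
                key=lambda kc: kc[1],
            )
            row_cost.append(c + target_cost(targets[i], u))
            row_back.append(k)
        best.append(row_cost)
        back.append(row_back)

    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path: List[U] = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    path.reverse()
    return path, total
```

With toy units such as `("a1", 100)` (a label and a pitch value), a target cost of `abs(t - u[1])` and a concatenation cost of `0.5 * abs(a[1] - b[1])`, the search prefers the sequence whose units both match their targets and join smoothly, rather than the best match at each position in isolation.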
