Abstract

What can be done to improve the performance of text-to-speech systems to meet the demands for increased intelligibility, naturalness, and flexibility in adapting to various speaker and voice characteristics, speaking styles, message types, etc.? The basic problems lie less in our synthesis peripherals than in a knowledge gap concerning the speech code, but we are also limited by the constraints of present rule systems. Examples of minimally contrasting stop consonant features and some typical confusions are discussed. Segmental and prosodic features are more intimately connected than is usually recognized, and their interaction effects are discussed. Special attention is devoted to voice source features, to stress correlates, and to the temporal organization of interstress intervals and pauses. The further development of synthesis should be guided by articulatory modelling of control functions, eventually leading to complete articulatory synthesis.
