Abstract

Many text-to-speech (TTS) systems under development in Europe and elsewhere generate the intonational properties of synthetic utterances from an intermediate, abstract phonological representation of prosodic features that is quite independent of any acoustic realisation. We discuss in particular the system under development at Edinburgh University's Centre for Speech Technology Research (CSTR). For evaluating certain aspects of synthetic prosody (notably accent placement and division into domains), this abstract representation is a more appropriate object of evaluation than the final acoustic output of a system, just as word stress and grapheme-to-phoneme conversion are appropriately evaluated in terms of symbol strings rather than acoustic output. By way of illustration, we present the results of an evaluation exercise carried out on the sentence-accent assignment rules of the CSTR system and based on just such an abstract representation; the exercise has been useful in improving our rules.

