Abstract

Three prosodic rules are proposed for realizing a Japanese text-to-speech conversion system. The first rule groups words into accent phrases and breath groups, using both the grammatical relationships among adjacent words or phrases, obtained by local dependency analysis, and a restriction on breath group length. The second rule decides the accent position in each accent phrase from the accent attributes of its constituent words; accentuation rules are applied to the constituent words according to the structure of the accent phrase. These two kinds of information, accent phrase (or breath group) boundaries and accent positions, are used to generate the fundamental frequency pattern of a sentence. The third rule, for segmental duration control, decides segment temporal patterns, which are used to lengthen or shorten prestored syllabic multiphone units such as CVC and CV segments. Together, these prosodic rules make it possible to produce speech waveforms with natural prosody using ECL's speech synthesis system.
