Abstract

The prosody model is one of the most important parts of any speech synthesizer, and it mainly influences the naturalness of the output. The intonation contour and phoneme lengths (together with speech quality) carry much of the extralinguistic and paralinguistic information contained in synthesized speech. The features reflecting the personality, mood and emotions of the speaker interact strongly with those reflecting speech styles. Nevertheless, an appropriate choice of prosody model and training material makes it possible to create a dedicated model for each speaking style. This paper presents our approach to modelling the acoustic parameters of prosody in two different speech styles in Slovak. Our model is based on classification and regression trees (CARTs). It uses one independent CART for phoneme lengths and three CARTs for fundamental frequency (F0) at the beginning, centre, and end of every syllable. Two hours of read speech were used to train a model of read speech. Recordings of a puppet player were used to train a model of acted speech. The models were implemented in the Kempelen 2.2 unit selection Slovak speech synthesizer. Listening tests have shown that the models are capable of capturing a significant share of the differences between the two speaking styles.
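
The four-tree layout described above (one CART for phoneme lengths, three for F0 at the beginning, centre, and end of each syllable) can be illustrated with a minimal sketch. It is not the Kempelen implementation: the tree learner is a bare-bones regression tree written for this example, the feature sets (`is_vowel`, `stressed`, `phrase_final`) are hypothetical, and every number in the toy data is invented purely for illustration.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(rows, targets, depth=0, max_depth=2):
    """Fit a small regression tree on numeric feature tuples."""
    # Make a leaf when the depth limit is reached or the targets are constant.
    if depth >= max_depth or sse(targets) == 0.0:
        return {"value": mean(targets)}
    best = None  # (cost, feature index, threshold)
    for f in range(len(rows[0])):
        # Every observed value except the largest is a candidate threshold.
        for threshold in sorted({r[f] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, f, threshold)
    if best is None:  # all rows identical, so no split is possible
        return {"value": mean(targets)}
    _, f, threshold = best
    go_left = [r[f] <= threshold for r in rows]
    return {
        "feature": f,
        "threshold": threshold,
        "left": fit_tree(
            [r for r, g in zip(rows, go_left) if g],
            [t for t, g in zip(targets, go_left) if g],
            depth + 1, max_depth),
        "right": fit_tree(
            [r for r, g in zip(rows, go_left) if not g],
            [t for t, g in zip(targets, go_left) if not g],
            depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]


# Hypothetical phoneme features: (is_vowel, phrase_final); invented lengths in ms.
phoneme_rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
phoneme_lengths = [60.0, 80.0, 90.0, 130.0]
duration_tree = fit_tree(phoneme_rows, phoneme_lengths)

# Hypothetical syllable features: (stressed, phrase_final);
# invented F0 targets in Hz at the (beginning, centre, end) of the syllable.
syllable_rows = [(0, 0), (1, 0), (0, 1), (1, 1)]
f0_targets = [
    (110.0, 112.0, 110.0),
    (120.0, 135.0, 125.0),
    (100.0, 95.0, 85.0),
    (115.0, 120.0, 95.0),
]
# One independent tree per F0 point, mirroring the three-CART layout.
f0_trees = [fit_tree(syllable_rows, [t[i] for t in f0_targets]) for i in range(3)]


def predict_f0(trees, row):
    """Return the predicted (beginning, centre, end) F0 for one syllable."""
    return tuple(predict(tree, row) for tree in trees)
```

Under these assumptions, a separate set of four trees would be trained per speaking style (one set on read speech, one on acted speech), and the synthesizer would query the set that matches the requested style.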
