Abstract

Voice conversion has traditionally focused on the spectrum. Current systems lack a robust prosody conversion method suited to different speaking styles. Recently, the unit selection technique has been applied to transform emotional intonation contours. This paper goes one step further: it explores strategies for training and configuring the selection cost function in an emotion conversion application. The proposed system, which uses accent groups as basic intonation units and also converts phoneme durations and intensity, is evaluated by means of a carefully designed subjective test involving the big six emotions. Although the expressiveness of the converted sentences is still far from that of natural emotional speech, satisfactory results are obtained when different configurations are used for different emotions.
