Emotional speech synthesis: from speech database to TTS

Juan Manuel Montero, Santiago Aguilera, Juana M Gutierrez-Arriola, Emilia Enriquez, Sira Palazuelos, José Manuel Pardo

doi:10.21437/icslp.1998-147

Abstract

Modern speech synthesisers have achieved a high degree of intelligibility, but cannot be regarded as natural-sounding devices. To reduce the monotony of synthetic speech, the implementation of emotional effects is now being progressively considered. This paper presents a thorough study of emotional speech in Spanish and its application to TTS, together with a prototype system that simulates emotional speech using a commercial synthesiser. The design and recording of a Spanish database are described, as is the analysis of the emotional prosody (by fitting the data to a formal model). Using the collected data, a rule-based simulation of three primary emotions was implemented in the Text-to-Speech system. Finally, the assessment of the synthetic voice through perception experiments classifies the system as capable of producing quality voice with recognisable emotional effects.

Full Text