Speech variability compensation for expressive speech synthesis

Jhing-Fa Wang, Ta-Wen Kuan, Yan-You Chen, Chia-Hao Chang, Chun-Yu Tsai

doi:10.1109/icot.2013.6521194

Abstract

In conventional HMM-based speech synthesis, algorithms for generating high-quality reading-style (neutral) speech have been well investigated. However, human-like expressive speech synthesis is still far from practical, for many reasons. One influential factor is that the speech variability caused by the speaker's arousal is rarely addressed in speech synthesis. Accordingly, this paper proposes a novel speech synthesis method that accounts for this variability. Doing so offers two major advantages. First, the proposed method can generate time-variant, human-like, expressive speech. Second, it increases the diversity of expressive speech and mitigates the monotony characteristic of traditional speech synthesis systems. Experimental results show that the proposed method improves the diversity of synthetic speech and achieves more expressive speech than the conventional HTS approach.
