Abstract
In recent decades, the harmonics plus noise model (HNM) has been used for prosodic modification of speech signals in high-quality environments. Such speech modification techniques allow Text-To-Speech systems to generate more expressive synthesis without requiring extensive corpus resources. More expressive synthesis can improve the user experience with human-machine interfaces. This paper presents an adaptation of the adaptive pre-emphasis linear prediction technique to the HNM for modifying vocal effort. The proposed transformation methodology is validated using a copy re-synthesis strategy on a speech corpus specifically designed for vocal effort research. Perceptual tests demonstrate the effectiveness of the proposed technique in performing various types of vocal effort conversions for the given corpus.