Pitch alteration technique in speech synthesis system

Jong-Soon Jung, Myung-Jin Bae, Jeong-Jin Kim

doi:10.1109/30.920435

https://doi.org/10.1109/30.920435

Abstract

In speech synthesis, the waveform coding method is used mainly in synthesis by analysis because of its high quality. Because the parameters of this coding method are not separated into excitation parameters and vocal tract parameters, it is difficult to apply waveform coding to synthesis by rule. Applying it to synthesis by rule therefore requires pitch alteration for prosody control. In speech synthesis by the conventional PSOLA (pitch synchronous overlap and add) technique, applying a symmetric window function to the asymmetric speech waveform causes an energy imbalance that depends on the degree of overlap when the pitch interval is adjusted. In this paper, to overcome this energy imbalance, we propose a new method that converts the asymmetric waveform to a symmetric one by time-frequency conversion. With this method, we obtain an average spectral distortion ratio of 6.38% as the pitch alteration ratio is varied.
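For orientation, the conventional PSOLA step that the abstract refers to can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `hann` and `psola` are hypothetical, the pitch period is assumed constant and known, pitch marks are assumed evenly spaced, and the output duration is allowed to change with the pitch ratio. It does not include the time-frequency conversion proposed in the paper.

```python
import math

def hann(n):
    # Periodic Hann window: a symmetric window of the kind PSOLA applies.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def psola(signal, period, ratio):
    """Overlap-add two-period segments at a new spacing.

    period: assumed constant pitch period in samples (hypothetical input).
    ratio:  pitch alteration ratio; values above 1 raise the pitch.
    """
    new_period = max(1, round(period / ratio))
    window = hann(2 * period)
    # Assumed pitch marks: one every `period` samples, kept away from the edges.
    marks = list(range(period, len(signal) - period + 1, period))
    if not marks:
        return []
    out = [0.0] * ((len(marks) - 1) * new_period + 2 * period)
    for k, mark in enumerate(marks):
        start = k * new_period  # synthesis position of the k-th segment
        for i in range(2 * period):
            out[start + i] += signal[mark - period + i] * window[i]
    return out

# Example: a 50-sample-period sine, pitch raised by a factor of 2.
tone = [math.sin(2.0 * math.pi * n / 50) for n in range(500)]
shifted = psola(tone, 50, 2.0)
```

With `ratio` equal to 1 the overlapping windows sum to one and the interior of the signal is reproduced; at other ratios the segments overlap by a different amount, which is the setting in which the abstract describes the energy imbalance.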
