Abstract

In speech synthesis, the waveform coding method is used mainly for synthesis by analysis because of its high quality. Because its parameters are not separated into excitation parameters and vocal tract parameters, the waveform coding method is difficult to apply to synthesis by rule. To apply it to synthesis by rule, pitch alteration is required for prosody control. In speech synthesis with the conventional PSOLA (pitch synchronous overlap and add) technique, applying a symmetric window function to the asymmetric speech waveform causes an energy imbalance that depends on the degree of overlap when the pitch interval is adjusted. In this paper, to overcome this energy imbalance, we propose a new method that converts the asymmetric waveform into a symmetric one by time-frequency conversion. With this method, we obtain an average spectral distortion ratio of 6.38%, averaged over the pitch alteration ratio.
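
The abstract does not say how the time-frequency conversion is carried out. As an illustration only, the sketch below shows one standard way to turn an asymmetric waveform into a symmetric one: transform it to the frequency domain, keep the magnitude spectrum, discard the phase, and transform back. This zero-phase reading is an assumption, not the authors' confirmed procedure. The test pulse and the names `zero_phase_version` and `asymmetry` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(), including the 1/n normalization."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def zero_phase_version(x):
    """Keep the magnitude spectrum of x and set every phase to zero.

    Assumption: this stands in for the paper's "time-frequency conversion".
    For real input the result is real and circularly symmetric, and it has
    the same energy as x because the magnitudes are unchanged.
    """
    magnitudes = [abs(c) for c in dft(x)]
    return [c.real for c in idft(magnitudes)]


def asymmetry(x):
    """Largest difference between x[t] and its circular mirror x[-t]."""
    n = len(x)
    return max(abs(x[t] - x[(-t) % n]) for t in range(n))


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


# Hypothetical asymmetric "pitch period": a sinusoid with a decaying envelope.
N = 64
pulse = [math.exp(-t / 8.0) * math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
symmetric = zero_phase_version(pulse)

print("asymmetry before:", asymmetry(pulse))
print("asymmetry after: ", asymmetry(symmetric))
print("energy before/after:", energy(pulse), energy(symmetric))
```

Because the symmetric version mirrors around its center, a symmetric window weights both halves equally, which is the property the abstract relies on to avoid the energy imbalance. The magnitude spectrum is preserved exactly in this sketch, so the 6.38% distortion reported above is not something this example reproduces.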
