Abstract
Residual-excited LPC is one of the most effective techniques for producing high-quality synthetic speech. However, pitch frequency is difficult to control with this technique, both when synthesizing speech whose pitch contour differs from that of the original speech and when synthesizing arbitrary speech from concatenated spoken units. Previous methods control pitch frequency by using pitch-period LPC residual waveforms extracted pitch-synchronously. However, synthesized speech quality is highly sensitive to the extraction position (Eρ) and window length (Ew), and poorly chosen values degrade voice quality. This paper proposes a new excitation waveform extraction method that automatically determines Eρ and Ew using spectral envelope distortion criteria between input and synthetic speech. Subjective evaluation experiments indicate that the pitch frequency pattern can be changed with relatively small deterioration in quality. Application of this method to arbitrary speech synthesis is also discussed.
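The selection idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the distortion measure, the search range, or the synthesis procedure, so the code assumes a grid search over candidate extraction positions and window lengths, a Hann-windowed residual pulse repeated at a target period, and an RMS log-spectral distance between LPC envelopes. All function names, parameters, and default values (`select_extraction`, `offsets`, `length_factors`, `target_period`) are hypothetical.

```python
import numpy as np

def lpc(signal, order):
    """LPC polynomial [1, a1, ..., ap] (autocorrelation method, Levinson-Durbin)."""
    n = len(signal)
    r = np.array([np.dot(signal[:n - k], signal[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err = max(err * (1.0 - k * k), 1e-12)  # guard against a zero error term
    return a

def log_envelope(a, nfft=512):
    """Log-magnitude (dB) of the all-pole envelope 1/|A(f)|, gain ignored."""
    return -20.0 * np.log10(np.abs(np.fft.rfft(a, nfft)) + 1e-12)

def all_pole(excitation, a):
    """Filter the excitation through 1/A(z) with zero initial state."""
    p = len(a) - 1
    y = np.zeros(len(excitation) + p)  # p leading zeros hold the initial state
    for n in range(len(excitation)):
        y[n + p] = excitation[n] - np.dot(a[1:], y[n + p - 1::-1][:p])
    return y[p:]

def select_extraction(frame, pitch_mark, period, target_period=None, order=12,
                      offsets=(-8, -4, 0, 4, 8), length_factors=(1.0, 1.5, 2.0)):
    """Assumed search: return (distortion_dB, offset, length) with least distortion."""
    target_period = period if target_period is None else target_period
    a = lpc(frame * np.hanning(len(frame)), order)
    residual = np.convolve(frame, a)[:len(frame)]  # inverse filtering with A(z)
    reference = log_envelope(a)
    best = None
    for off in offsets:                # candidate extraction positions
        for fac in length_factors:     # candidate window lengths
            length = int(round(fac * period))
            start = pitch_mark + off - length // 2
            if start < 0 or start + length > len(residual):
                continue
            pulse = residual[start:start + length] * np.hanning(length)
            excitation = np.zeros(len(frame) + length)
            for pos in range(0, len(frame), target_period):
                excitation[pos:pos + length] += pulse  # overlap-add at new pitch
            synth = all_pole(excitation[:len(frame)], a)
            a_syn = lpc(synth * np.hanning(len(synth)), order)
            dist = np.sqrt(np.mean((reference - log_envelope(a_syn)) ** 2))
            if best is None or dist < best[0]:
                best = (dist, off, length)
    return best
```

Calling `select_extraction(frame, pitch_mark, period, target_period=new_period)` on one voiced frame returns the candidate pair whose resynthesized envelope is closest to the input envelope under these assumptions, or `None` if no candidate window fits inside the frame.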