Abstract

In recent years, several systematic speech synthesis methods for Japanese have been proposed that are based on pitch-synchronous overlap-add (PSOLA) waveform superposition. These methods generally use the phoneme or the syllable as the synthesis unit. Because they concatenate units at phoneme boundaries, noise at the concatenation points is a common problem. To address this noise, a method based on VCV (vowel-consonant-vowel) synthesis units is proposed, in which units are concatenated within the steady-state section of the vowel. The original VCV method does not account for unvoiced vowels; to improve it, unvoiced vowels are treated as exceptional segments and included in the waveform database. When a VCV unit is divided into more primitive parts, it is useful to take spectral distortion into account when choosing segments. The method was evaluated with 50 synthesized speech samples, and a correlation was found between the evaluation scores and the spectral distortion.
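
The abstract does not state which spectral distortion measure is used to choose segments. As an illustration only, the sketch below assumes a log-spectral distortion (the RMS difference in dB between two log power spectra) and picks the candidate segment whose boundary frame is closest to a target frame. The function names `magnitude_spectrum`, `log_spectral_distortion`, and `pick_segment`, the power floor, and the frame length are all hypothetical and not taken from the paper.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 of a real-valued frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc))
    return spectrum


def log_spectral_distortion(frame_a, frame_b, floor=1e-10):
    """RMS difference in dB between the log power spectra of two frames.

    This measure is an assumption chosen for illustration; `floor` clamps
    near-zero power so the logarithm stays finite.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same length")
    spec_a = magnitude_spectrum(frame_a)
    spec_b = magnitude_spectrum(frame_b)
    total = 0.0
    for a, b in zip(spec_a, spec_b):
        db_a = 10.0 * math.log10(max(a * a, floor))
        db_b = 10.0 * math.log10(max(b * b, floor))
        total += (db_a - db_b) ** 2
    return math.sqrt(total / len(spec_a))


def pick_segment(target_frame, candidate_frames):
    """Return the index of the candidate with the smallest distortion."""
    scores = [log_spectral_distortion(target_frame, c)
              for c in candidate_frames]
    return min(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    n = 16
    target = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
    mismatched = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
    similar = [0.9 * x for x in target]
    print(pick_segment(target, [mismatched, similar]))
```

In a real system the frames would be windowed speech taken around the concatenation point, and a perceptually motivated measure (for example, a cepstral distance) might be preferred; the selection logic would stay the same.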
