Speech synthesis software with a variable speaking rate and its implementation on a 32-bit microprocessor

T Ebihara, Y Kisuki, T Hase, T Sakamoto, Y Ishikawa

doi:10.1109/30.883468

Abstract

This paper describes a new speech synthesis system that produces speech at a controllable rate. The method is based on the oscillator model, in which output speech of a desired length can be obtained without extracting pitch-synchronous positions. This model has been applied to a residual-excited vocoder to improve the sound quality of synthesized speech. The proposed method is based on the duration of phonemes in natural speech. Phonemes are classified for the time-scale modification algorithm, which makes it easy to control the duration of phonemes at various speaking rates. Sound quality evaluation tests confirmed that the quality of sound produced by this new method is better than that produced by existing methods. The method was verified by implementing it as real-time synthesis software. The software required 8 MIPS of CPU power and ran on a 32-bit microprocessor.
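
The idea of class-dependent duration control can be illustrated with a minimal sketch. This is not the authors' algorithm: the class names, the elasticity values, and the function scale_durations are all hypothetical, chosen only to show how classifying phonemes lets some classes (for example vowels) absorb most of a rate change while others (for example plosives) stay close to their natural duration.

```python
# Illustrative only: classes and elasticity values are assumptions,
# not taken from the paper.
ELASTICITY = {
    "vowel": 1.0,      # assumed to follow the rate change fully
    "fricative": 0.6,  # assumed to follow it partially
    "plosive": 0.2,    # assumed to stay near its natural duration
}


def scale_durations(phonemes, rate):
    """Scale phoneme durations for a speaking rate.

    phonemes: list of (label, phoneme_class, duration_ms) tuples.
    rate: speaking-rate multiplier; 1.0 is normal, 2.0 is twice as fast.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    target = 1.0 / rate  # duration factor for a fully elastic phoneme
    scaled = []
    for label, phoneme_class, duration_ms in phonemes:
        e = ELASTICITY[phoneme_class]
        # Interpolate between "unchanged" (e = 0) and "fully scaled" (e = 1).
        factor = 1.0 + e * (target - 1.0)
        scaled.append((label, phoneme_class, duration_ms * factor))
    return scaled


if __name__ == "__main__":
    utterance = [("a", "vowel", 100.0), ("s", "fricative", 80.0), ("t", "plosive", 40.0)]
    for label, cls, dur in scale_durations(utterance, rate=2.0):
        print(f"{label} ({cls}): {dur:.1f} ms")
```

In this sketch, doubling the rate halves the vowel but shortens the plosive by only 10 percent. A real system would derive such factors from measured phoneme durations in natural speech, as the abstract indicates.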
