MBR-PSOLA: Text-To-Speech synthesis based on an MBE re-synthesis of the segments database

T Dutoit,H Leich

doi:10.1016/0167-6393(93)90042-j

Abstract

The use of the Time-Domain Pitch Synchronous OverLap-Add (TD-PSOLA) algorithm in a Text-To-Speech synthesizer is reviewed. Its drawbacks are underlined and three conditions on the speech database are examined. In order to satisfy them, a previously described high quality resynthesis process is developed and enhanced, which makes use of the well-known Multi-Band Excited (MBE) model. An important by-product of this operation is that optimal Pitch Marking turns out to be automatic. A temporal interpolation block is finally added. The resulting Multi-Band Resynthesis Pitch Synchronous OverLap Add (MBR-PSOLA) synthesis algorithm supports spectral interpolation between voiced parts of segments, with virtually no increase in complexity. It provides the basis of a high-quality Text-To-Speech (TTS) synthesizer.

Full Text