Abstract

An optimum coding of the control parameters of a formant speech synthesizer is proposed. The optimisation of this coding is based on statistical and subjective criteria. The synthesizer used is a parallel formant synthesizer capable of producing high-quality speech, whether voiced or unvoiced. The utterances chosen for experimentation are groups of high-quality synthetic French CVCV (consonant-vowel-consonant-vowel) sequences that represent French speech faithfully. The first group consists of voiced stops (b, d and g with the vowels a, u and i), and the second group consists of voiced fricatives (ʒ and z with the vowels a, u and i). The proposed procedure has four steps. The first is a statistical study of the control parameters of the synthetic utterances, carried out to find the optimum effective dynamic range of each parameter. The second step finds the minimum number of bits needed to quantize each parameter independently with no noticeable degradation. The third step finds the minimum number of bits when all the parameters are quantized simultaneously. To do this, the parameters are arranged in sub-groups, and the minimum number of bits is found with all the parameters of a sub-group quantized together. The sub-groups are then merged progressively until the optimum number of bits is found with all parameters quantized simultaneously. The final step finds the optimum sampling rate for each interval of each utterance. A variable sampling interval is proposed that depends on the nature of the speech events.
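The first two steps can be illustrated with a minimal sketch. This is not the procedure from the paper: the function names, the trimming rule used to pick the effective range, and the numeric error threshold are all assumptions made for illustration. In particular, the paper judges degradation subjectively (by listening), whereas the sketch substitutes a simple worst-case reconstruction error as a stand-in.

```python
def effective_range(values, trim_fraction=0.01):
    """Assumed rule: drop a small fraction of extreme samples at each end
    and use what remains as the effective dynamic range."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    return ordered[k], ordered[len(ordered) - 1 - k]


def quantize(value, lo, hi, n_bits):
    """Uniform quantizer with 2**n_bits levels spanning [lo, hi].
    Returns the integer code and the reconstructed value."""
    levels = 2 ** n_bits
    step = (hi - lo) / (levels - 1)
    clipped = min(max(value, lo), hi)
    code = round((clipped - lo) / step)
    return code, lo + code * step


def min_bits(values, lo, hi, max_error, max_bits=16):
    """Smallest bit count whose worst-case error (inside the range) stays
    at or below max_error. The threshold is a hypothetical proxy for the
    subjective 'no noticeable degradation' criterion."""
    for n_bits in range(1, max_bits + 1):
        worst = max(
            abs(quantize(v, lo, hi, n_bits)[1] - min(max(v, lo), hi))
            for v in values
        )
        if worst <= max_error:
            return n_bits
    return None


# Hypothetical formant-frequency track in Hz, for illustration only.
track = [200 + 7 * i for i in range(101)]
lo, hi = effective_range(track, trim_fraction=0.0)
print(quantize(555, lo, hi, 3))  # (4, 600.0)
```

Running `min_bits` separately for each control parameter mirrors the independent search of the second step. The simultaneous quantization of the third step would instead need a joint evaluation across sub-groups of parameters, which this sketch does not attempt.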
