Abstract

Several quantization schemes for eigenparameters of speech have been studied as part of an effort to determine a set of spectral patterns that will be adequate to synthesize speech having the characteristics of a particular talker. This paper describes an experiment in which sentence material from each of two male and two female talkers was analyzed using an autocorrelation method to obtain 12 log area ratios at 10-millisecond intervals. An eigenparameter analysis was then performed on the log area ratios. The eigenparameters were quantized in each of six ways based on their variance and standard deviation. Ten versions of speech were created for each talker: one was natural, three were synthesized from nonquantized parameters, and six were synthesized from quantized parameters. Each version of speech was paired with every other version, and listeners were asked which of the two versions in a pair they preferred. The same quantization schemes did not produce the same listener preferences for all talkers. Average spectral similarities of each quantized version (relative to a 12-parameter version) were also calculated using the minimum prediction residual and an ad hoc measure that maintains distance symmetry between “sample” and “template.” The results will be considered in terms of how well measures of spectral similarity correlate with listener preferences.
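
As a rough illustration of the analysis front end described above, the following Python sketch derives log area ratios from one frame of samples by the autocorrelation method. It is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, no windowing or pre-emphasis is applied, and the sign convention used for the log area ratio, log((1 + k) / (1 - k)), is only one of several in common use.

```python
import math


def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] for one frame of samples."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def reflection_coefficients(r):
    """Levinson-Durbin recursion: autocorrelation -> reflection coefficients.

    Assumes r[0] > 0 (a non-silent frame); a silent frame would divide by zero.
    """
    order = len(r) - 1
    a = [1.0] + [0.0] * order  # predictor polynomial coefficients, a[0] = 1
    err = r[0]                 # prediction error energy at order 0
    ks = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err
        ks.append(k)
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k     # error energy shrinks at each order
    return ks


def log_area_ratios(ks):
    """Map reflection coefficients (|k| < 1) to log area ratios.

    Assumed convention: log((1 + k) / (1 - k)); other texts use the inverse.
    """
    return [math.log((1.0 + k) / (1.0 - k)) for k in ks]
```

With a 12th-order analysis, `log_area_ratios(reflection_coefficients(autocorrelation(frame, 12)))` would yield 12 values per frame, matching the parameter count in the abstract. The eigenparameter analysis and the six quantization schemes are not specified in enough detail here to sketch.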
