Abstract

Normalization algorithms that seek to reduce the dispersion of phonetically similar vowels produced by vocal tracts of unequal size, and that do so exclusively on the basis of formant frequency data, ignore a number of perceptually relevant aspects of the acoustic signal, such as fundamental frequency, formant bandwidth, and spectral rolloff. These so-called secondary characteristics of the speech signal interact with certain nonlinear properties of the inner and middle ear to produce an output function that is not predictable from formant frequencies alone. The model proposed here takes F1, F2, F3, and F0 as input and derives its output through a set of transformations based on (a) acoustic parameters (e.g., formant bandwidth, spectral rolloff) and (b) nonlinear properties associated with the inner and middle ear (e.g., critical bands, mechanical sensitivity). The operational characteristics of the model suggest that differences in vocal tract size can, in large part, be offset by appropriate modification of fundamental frequency.
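
The abstract does not specify the model's transformations, so the following is only a minimal sketch of one ingredient it names, the critical-band scale, and of how F0 might offset a formant difference between speakers. It assumes the widely used Traunmüller approximation for converting Hz to Bark, and it expresses F1 relative to F0 on that scale. The function names and the two sets of speaker values are hypothetical illustrations, not the authors' method or data.

```python
def hz_to_bark(f_hz: float) -> float:
    """Approximate critical-band rate (Bark) for a frequency in Hz.

    Uses the Traunmueller approximation; this choice is an assumption,
    since the abstract does not say which critical-band scale is used.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_profile(f0: float, f1: float, f2: float, f3: float) -> dict:
    """Express F0 and the first three formants as Bark differences.

    Hypothetical helper: it shows only the general idea of referring
    formants to F0 on an auditory scale, not the paper's full model.
    """
    z0, z1, z2, z3 = (hz_to_bark(f) for f in (f0, f1, f2, f3))
    return {
        "F1": z1,           # first formant alone, in Bark
        "F1-F0": z1 - z0,   # first formant relative to the fundamental
        "F2-F1": z2 - z1,
        "F3-F2": z3 - z2,
    }


# Illustrative (not measured) values for the same vowel spoken by a
# larger and a smaller vocal tract; the smaller one has a higher F0.
large_tract = bark_profile(f0=136.0, f1=270.0, f2=2290.0, f3=3010.0)
small_tract = bark_profile(f0=272.0, f1=370.0, f2=3200.0, f3=3730.0)

if __name__ == "__main__":
    for key in ("F1", "F1-F0"):
        gap = abs(large_tract[key] - small_tract[key])
        print(f"{key}: speaker gap = {gap:.2f} Bark")
```

With these illustrative values, the gap between the two speakers is smaller for F1 taken relative to F0 than for F1 alone, which is the kind of offsetting effect of fundamental frequency that the abstract describes.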
