Abstract

Several glottal flow models have been proposed for speech analysis and synthesis (e.g., LF, Rosenberg, R++, and Klatt). These models do not all use the same number of parameters, or the same names for similar parameters, which makes it difficult to compare their merits. A unified framework for studying the time- and frequency-domain properties of glottal flow models is therefore proposed. It is shown that all the models can be represented by a common set of five time-domain parameters: three scale parameters (T0, peak amplitude, open quotient), a shape parameter (asymmetry quotient), and a closure continuity parameter. A generating function is computed for each model by normalizing the model with respect to the scale parameters and the closure continuity parameter. The specific features of each model are represented in its generating function. The spectrum of the generating functions is low-pass, and the spectrum of their derivatives can be characterized by a spectral maximum, termed the "glottal formant." The closure continuity parameter corresponds to a spectral tilt component. The scale parameters are interpreted using the scaling properties of the Fourier transform. The glottal flow spectra can thus be characterized by two breakpoints, whose frequencies can be computed analytically for each model parameter setting.
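As an illustration of the common parameter set, the following minimal sketch evaluates a Rosenberg-type pulse (the trigonometric "C" variant) from T0, peak amplitude, open quotient, and asymmetry quotient. It is not taken from the paper: the function and argument names are hypothetical, the asymmetry quotient is assumed to be the ratio of the opening-phase duration to the open-phase duration, and the closure is abrupt, so the closure continuity parameter is not modeled here.

```python
import math

def rosenberg_c_pulse(t, T0, amplitude, Oq, alpha_m):
    """Glottal flow at time t (seconds) for a periodic Rosenberg-C-type pulse.

    Assumed parameterization (illustrative, not the paper's notation):
      T0        fundamental period in seconds
      amplitude peak value of the flow
      Oq        open quotient, open-phase duration divided by T0 (0 < Oq <= 1)
      alpha_m   asymmetry quotient, opening phase / open phase (0 < alpha_m < 1)
    """
    Te = Oq * T0        # open-phase duration
    Tp = alpha_m * Te   # opening phase: flow rises from 0 to its peak
    Tn = Te - Tp        # closing phase: flow falls from the peak back to 0
    t = t % T0          # the pulse repeats with period T0
    if t <= Tp:
        return 0.5 * amplitude * (1.0 - math.cos(math.pi * t / Tp))
    if t <= Te:
        return amplitude * math.cos(0.5 * math.pi * (t - Tp) / Tn)
    return 0.0          # closed phase: no flow

def sample_period(T0, amplitude, Oq, alpha_m, n=200):
    """Return n evenly spaced samples of the flow over one period."""
    return [rosenberg_c_pulse(k * T0 / n, T0, amplitude, Oq, alpha_m)
            for k in range(n)]
```

In this sketch, changing T0 with Oq and alpha_m fixed only stretches or compresses the waveform in time. By the scaling property of the Fourier transform, stretching a signal in time by a factor a compresses its spectrum in frequency by the same factor, which is why T0 and the open quotient act as scale parameters.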
