Abstract

The acoustic output for a given configuration and excitation of the vocal tract during speech production can be described in a relatively compact manner by a transfer function characterized by a reasonably small number of poles and zeros. Thus the spectrum of a speech sound (log amplitude vs. frequency) may be considered to be the sum of several elemental spectra, each associated with one pole or zero. The spectrum envelope of any vowel, for example, could be approximated reasonably accurately by combinations of spectra drawn from a catalog of 30 to 40 curves that represent simple resonances at a number of different frequencies. A digital computer has been programmed to perform such an approximation on spectra of vowel and some consonant sounds sampled periodically in time. Successive comparisons are made between a particular speech spectrum and spectra constructed from a catalog of elementary curves that are stored in the memory. The computer is programmed to converge rapidly from a first approximation of the spectrum, based on previous examinations of adjacent samples, towards the synthesized spectrum that yields the best fit with the input spectrum, and to read out numbers that identify the particular elemental spectra that are finally selected. [Supported by U. S. Army (S.C.), U. S. Air Force (O.S.R., A.R.D.C.), and U. S. Navy (Office of Naval Research), and by Air Force Cambridge Research Center Contract AF 19(604)-2061.]
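
The matching procedure described above can be illustrated with a minimal sketch. It is not the original program, whose details the abstract does not give. The frequency grid, the catalog of 35 resonance curves, the fixed 100 Hz bandwidth, the squared-error criterion, and the neighbor-by-neighbor search are all assumptions chosen for illustration, and every name (`FREQS`, `CATALOG`, `resonance_db`, `synthesize`, `error`, `fit`) is hypothetical. The sketch relies only on the fact stated in the abstract: on a log-amplitude scale, the spectrum of a cascade of resonances is the sum of the individual resonance curves.

```python
import math

# Assumed analysis grid (Hz) and assumed catalog of resonance center frequencies.
FREQS = [100.0 * k for k in range(41)]            # 0 .. 4000 Hz
CATALOG = [150.0 + 100.0 * k for k in range(35)]  # 35 curves, within the 30 to 40 cited
BANDWIDTH = 100.0                                 # assumed fixed bandwidth (Hz)


def resonance_db(center, bandwidth=BANDWIDTH):
    """Log-amplitude curve (dB) of one conjugate pole pair, 0 dB at f = 0."""
    half = bandwidth / 2.0
    num = center ** 2 + half ** 2
    curve = []
    for f in FREQS:
        den = math.sqrt(((f - center) ** 2 + half ** 2) *
                        ((f + center) ** 2 + half ** 2))
        curve.append(20.0 * math.log10(num / den))
    return curve


# The stored catalog of elemental spectra.
CURVES = [resonance_db(c) for c in CATALOG]


def synthesize(indices):
    """Sum the selected elemental log spectra into one synthesized spectrum."""
    return [sum(CURVES[i][k] for i in indices) for k in range(len(FREQS))]


def error(target, indices):
    """Squared difference (dB^2) between the input and synthesized spectra."""
    return sum((t - s) ** 2 for t, s in zip(target, synthesize(indices)))


def fit(target, start):
    """Local search from a first approximation, e.g. the previous sample's result.

    Each resonance is moved to a neighboring catalog entry whenever that
    lowers the error; the search stops when no single move helps.
    """
    best = list(start)
    best_err = error(target, best)
    improved = True
    while improved:
        improved = False
        for slot in range(len(best)):
            for step in (-1, 1):
                trial = list(best)
                trial[slot] += step
                if not 0 <= trial[slot] < len(CATALOG):
                    continue
                trial_err = error(target, trial)
                if trial_err < best_err:
                    best, best_err, improved = trial, trial_err, True
    return sorted(best), best_err
```

A call such as `fit(spectrum, previous_indices)` returns the catalog indices of the selected curves together with the residual error, which corresponds to the "numbers that identify the particular elemental spectra" in the abstract. Because the search is local, it can stop at a fit that is not the global best; a good first approximation from the adjacent sample is what makes this acceptable.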
