Abstract

An inverse filter formulation is applied to the problem of formant extraction from continuous voiced speech. Although the basic analysis equations can be derived from several different formulations, the inverse filter is believed to be the most appropriate because of the insight it lends to the problem as a frequency-domain formulation. Starting from the basic analysis equations, a new algorithm is described for automatically extracting a set of raw data from which the first three formant trajectories can generally be defined by inspection. The raw data set is obtained by simple peak-picking of the reciprocal of the resulting inverse filter spectrum for each analysis frame. Although an automatic formant extraction algorithm is not considered here, for over 90% of the raw data automatic extraction is trivial: F1(k) through F3(k) for frame k equal the first three raw data samples in increasing order of frequency. Examples illustrate that the inverse filter algorithm is capable of correctly extracting closely spaced formant structure and fast transitions. Each analysis frame requires computation roughly equivalent to a single 256-point complex radix-2 FFT. The complete analysis procedure has been implemented on a small PDP-8 computer with 4 K of core. A more detailed discussion along with computer programs will be presented [SCRL monograph No. 7 (to be published)].
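
The per-frame procedure summarized above (fit an inverse filter, take the reciprocal of its spectrum, pick the peaks) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the program from the monograph. The abstract does not name the method used to obtain the inverse filter; the sketch assumes the autocorrelation method solved by the Levinson-Durbin recursion. The spectrum is evaluated by a direct sum rather than an FFT. All function names, the filter order, the 10 kHz sampling rate, and the synthetic test frame (the impulse response of three cascaded resonators) are hypothetical choices made for the example.

```python
import cmath
import math


def autocorrelation(frame, max_lag):
    """Short-time autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Coefficients of the inverse filter A(z) = 1 + a1 z^-1 + ... + aM z^-M."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = sum(a[j] * r[m - j] for j in range(m))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= (1.0 - k * k)                # residual energy after order m
    return a


def reciprocal_spectrum(a, n_points=256):
    """1 / |A(e^{jw})|^2 sampled at n_points frequencies in [0, pi)."""
    spec = []
    for i in range(n_points):
        w = math.pi * i / n_points
        resp = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        spec.append(1.0 / abs(resp) ** 2)
    return spec


def raw_formant_candidates(frame, fs, order):
    """Peak frequencies (Hz) of the reciprocal inverse filter spectrum."""
    a = levinson_durbin(autocorrelation(frame, order), order)
    spec = reciprocal_spectrum(a)
    n = len(spec)
    return [i * (fs / 2.0) / n for i in range(1, n - 1)
            if spec[i - 1] < spec[i] >= spec[i + 1]]


def resonator(x, freq, bw, fs):
    """Two-pole resonator, used only to build a synthetic test frame."""
    r = math.exp(-math.pi * bw / fs)
    c1 = 2.0 * r * math.cos(2.0 * math.pi * freq / fs)
    c2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for s in x:
        out = s + c1 * y1 + c2 * y2
        y.append(out)
        y2, y1 = y1, out
    return y


def synthetic_frame(formants, fs, length=400):
    """Impulse response of cascaded resonators (assumed test signal)."""
    frame = [1.0] + [0.0] * (length - 1)
    for freq, bw in formants:
        frame = resonator(frame, freq, bw, fs)
    return frame


if __name__ == "__main__":
    fs = 10000.0
    frame = synthetic_frame([(500, 60), (1500, 90), (2500, 120)], fs)
    peaks = raw_formant_candidates(frame, fs, order=6)
    # The "trivial" case from the abstract: the first three peaks,
    # in increasing order of frequency, are taken as F1 through F3.
    print(peaks[:3])
```

On real speech the frame would normally be windowed and a higher filter order used, so the peak list can contain more than three entries or miss a formant; those are the cases the abstract leaves to inspection.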
