Abstract

The locations of formants in a speech signal are usually estimated by computing the linear predictive coding (LPC) coefficients over a sliding window and estimating the peaks in the spectrum of the resulting all-pole LPC filter. The peaks are estimated either by solving for the roots of the LPC polynomial or by computing its DFT and finding the peaks in the magnitude spectrum. Three different sources of error in this analysis were investigated using synthesized vowels: (1) F0 quantization: In addition to the expected result that error increases with F0, it was found that the absolute error decreases with higher formant frequencies. (2) LPC+root solving: The location and bandwidth of a peak are usually estimated from the location of a pole in the z-plane. It was found that this approximation is invalid if the formants have high bandwidths or if they are either too close to or too far from one another. (3) LPC+DFT: To compensate for the quantization error introduced by the short-term DFT, a three-point parabolic interpolation scheme is usually employed for better estimation of peak locations. It was found that this compensation scheme is effective only for high-bandwidth formants. [Work supported by NIMH.]
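The two estimation steps named in items (2) and (3) can be sketched as follows. The abstract does not give formulas, so this sketch assumes the textbook forms: a pole at radius r and angle θ is mapped to a frequency of θ·fs/(2π) and a bandwidth of −ln(r)·fs/π, and the peak of a sampled spectrum is refined by fitting a parabola through the largest bin and its two neighbours. All function and parameter names (`pole_to_formant`, `formant_to_pole`, `parabolic_offset`, `refined_peak_hz`, `fs`, `n_fft`) are hypothetical and chosen for illustration; they are not taken from the study.

```python
import cmath
import math


def pole_to_formant(pole, fs):
    """Map one complex pole to (frequency_hz, bandwidth_hz).

    Assumed single-pole approximation: the pole angle gives the centre
    frequency and the pole radius gives the 3 dB bandwidth.
    """
    freq_hz = cmath.phase(pole) * fs / (2.0 * math.pi)
    bandwidth_hz = -math.log(abs(pole)) * fs / math.pi
    return freq_hz, bandwidth_hz


def formant_to_pole(freq_hz, bandwidth_hz, fs):
    """Inverse mapping, handy for building synthetic test cases."""
    radius = math.exp(-math.pi * bandwidth_hz / fs)
    angle = 2.0 * math.pi * freq_hz / fs
    return cmath.rect(radius, angle)


def parabolic_offset(left, centre, right):
    """Fractional-bin offset of the vertex of the parabola through
    three equally spaced samples, with `centre` the largest of them."""
    denom = left - 2.0 * centre + right
    if denom == 0.0:
        # The three samples are collinear: no curvature to exploit.
        return 0.0
    return 0.5 * (left - right) / denom


def refined_peak_hz(mag, k, fs, n_fft):
    """Refine the peak at DFT bin k of `mag` (often a log-magnitude
    spectrum) and convert the fractional bin index to hertz."""
    offset = parabolic_offset(mag[k - 1], mag[k], mag[k + 1])
    return (k + offset) * fs / n_fft
```

In this sketch, `pole_to_formant` treats each pole in isolation, which is the approximation item (2) reports as breaking down for wide or unfavourably spaced formants. `parabolic_offset` is exact only when the spectrum near the peak is truly parabolic.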
