Abstract

Estimating formant contours from continuous speech is a difficult problem that has required either elaborate heuristic rules or complex hidden Markov models. At high noise levels, these methods deteriorate significantly because the spectral estimator on which they are based performs poorly. This paper proposes a method of formant contour estimation that uses a measure of confidence in the spectral peak estimates. At high noise levels, most spectral estimation techniques yield large variance and more spurious or missing peaks; however, the broad distribution of spectral energy is found to be determined more reliably by filter-bank analysis. This spectral energy distribution is used to differentiate regions of high and low signal-to-noise ratio in the spectrum (for white noise corruption), which in turn determine a “confidence measure” for the spectral peak estimates. Adding this confidence information to the noisy spectral peak data of a speech utterance makes clear segments of the formant contour stand out from the remaining noisy peak data. With these segments as anchor regions, an algorithm is developed that determines the complete contours within the constraints of known speech properties. [Work supported by NSF.]
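The confidence idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes equal-width filter-bank channels, so that white noise contributes the same energy to every channel, and it assumes that this noise energy is already known. The function names, the linear SNR-to-confidence ramp, and the thresholds `snr_lo_db` and `snr_hi_db` are all hypothetical choices made for illustration.

```python
import math

def band_confidence(band_energies, noise_energy, snr_lo_db=0.0, snr_hi_db=15.0):
    """Map each channel's estimated SNR to a confidence in [0, 1].

    Assumption: equal-width channels and white noise, so every channel
    carries the same noise energy. The thresholds are illustrative only.
    """
    confidences = []
    for energy in band_energies:
        # Remove the flat noise floor; clamp so the logarithm stays defined.
        signal = max(energy - noise_energy, 1e-12)
        snr_db = 10.0 * math.log10(signal / noise_energy)
        # Linear ramp between the two hypothetical SNR thresholds.
        c = (snr_db - snr_lo_db) / (snr_hi_db - snr_lo_db)
        confidences.append(min(1.0, max(0.0, c)))
    return confidences

def score_peaks(peak_freqs_hz, band_edges_hz, confidences):
    """Attach to each spectral peak the confidence of the channel containing it."""
    scored = []
    for f in peak_freqs_hz:
        for i in range(len(confidences)):
            if band_edges_hz[i] <= f < band_edges_hz[i + 1]:
                scored.append((f, confidences[i]))
                break
    return scored

# Hypothetical single frame: four 1 kHz channels and three candidate peaks.
energies = [100.0, 2.0, 40.0, 1.0]
edges = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
conf = band_confidence(energies, noise_energy=1.0)
scored = score_peaks([500.0, 1500.0, 2500.0], edges, conf)
```

Peaks that fall in high-confidence channels would be the candidates for the anchor segments mentioned in the abstract; how those segments are then joined into complete contours is not covered by this sketch.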
