Abstract

An interesting topic in recent speech analysis is the possibility of estimating vocal-tract shapes, and ideally extracting some articulatory parameters, from speech sounds. Estimation of vocal-tract shapes, either from acoustic data or directly from acoustic speech waveforms, has been investigated on the basis of the linear prediction model of speech analysis (LPC model). Although precise reconstruction of the original smooth shapes from the bandlimited speech signal is known to be theoretically impossible, a fairly reasonable estimate of the shapes is expected to be possible in terms of discrete acoustic tubes under certain assumptions and constraints. Uncertainty due to individual voice source characteristics can be avoided to a large extent by estimating the formant frequencies and bandwidths during the glottis-closed portion of speech waveforms. The differences between the LPC model and the actual speech production mechanism can be compensated for by modifying the measured formant frequencies and bandwidths using a realistic speech production model. The merits of the method and the technical problems it involves are discussed, and some analysis examples are given.
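
As a rough illustration of what "discrete acoustic tubes" means in the LPC setting, the sketch below converts autocorrelation values into reflection coefficients with the Levinson-Durbin recursion and then into relative cross-sectional areas of a lossless tube model. It is a textbook-style sketch, not the method of this paper: the function names, the normalization of the first section to unit area, and the sign convention in the area-ratio formula are assumptions (conventions differ across the literature), and the closed-glottis analysis and formant corrections described above are not included.

```python
def autocorrelation(signal, max_lag):
    """Biased (unnormalized) autocorrelation for lags 0..max_lag."""
    n = len(signal)
    return [
        sum(signal[t] * signal[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for A(z) = 1 + sum_j a[j] z^-j.

    Returns (a, k): the predictor polynomial a[0..order] with a[0] = 1,
    and the reflection coefficients k[0..order-1].
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]  # prediction error energy, shrinks at each step
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        k.append(ki)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + ki * a[i - j]
        new_a[i] = ki
        a = new_a
        err *= 1.0 - ki * ki
    return a, k


def tube_areas(k, first_area=1.0):
    """Relative section areas from reflection coefficients.

    Assumed convention: area[i + 1] / area[i] = (1 - k[i]) / (1 + k[i]).
    Only area ratios are meaningful; first_area is an arbitrary scale.
    """
    areas = [first_area]
    for ki in k:
        areas.append(areas[-1] * (1.0 - ki) / (1.0 + ki))
    return areas


if __name__ == "__main__":
    # Hypothetical autocorrelation values, chosen only for illustration.
    r = [1.0, 0.5, 0.25]
    a, k = levinson_durbin(r, order=2)
    print("reflection coefficients:", k)
    print("relative tube areas:", tube_areas(k))
```

Because every reflection coefficient from this recursion has magnitude below one for a valid autocorrelation sequence, the resulting areas stay positive. The result is a piecewise-constant area function with one section per LPC order, which is why only a discrete approximation of the smooth vocal-tract shape can be recovered.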
