Abstract

Determining the area function from the speech signal has long been known to be an ill-posed problem: many area functions correspond to the same speech spectrum. To make the problem well-posed, the inversion process must be constrained. In this work, inversion is performed for the dynamic formant pattern of a CV or VV transition as a whole, rather than for a single static pattern. The inputs to the process are the formant values, along with the first and second time derivatives of each formant, for each frame of the formant pattern. This provides more information about the possible area function solutions, and thereby acts as a constraint. In addition, the area function change during the transition is constrained to be stationary at one location in the vocal tract, while maximal change occurs at only two other locations, with the area increasing at one of them and decreasing at the other. This constraint is derived from an empirical study of area function change. These constraints are implemented as constraints on a Riccati recursion for the reflection coefficients. Comparison with other work on dynamic constraints on speech inversion will be provided. [Work supported by NIH.]
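
As a rough illustration of the inputs described above, the sketch below assembles, for each frame, the value of each formant together with estimates of its first and second time derivatives. The abstract does not say how the derivatives are obtained; the three-point finite-difference scheme, the function names `time_derivatives` and `inversion_inputs`, and the `frame_period` parameter are all assumptions made for this example, not the authors' method.

```python
def time_derivatives(track, frame_period):
    """Estimate first and second time derivatives of one formant track.

    track: formant frequencies in Hz, one value per frame (at least 3 frames).
    frame_period: time between successive frames, in seconds.
    Assumption: three-point finite differences; the source does not specify
    how the derivatives are computed.
    """
    n = len(track)
    if n < 3:
        raise ValueError("need at least three frames")
    first, second = [], []
    for i in range(n):
        # Edge frames reuse the nearest interior three-point stencil.
        c = min(max(i, 1), n - 2)
        prev, cur, nxt = track[c - 1], track[c], track[c + 1]
        d2 = (nxt - 2.0 * cur + prev) / frame_period ** 2
        d1_centre = (nxt - prev) / (2.0 * frame_period)
        # Shift the centred slope to frame i using the local curvature.
        first.append(d1_centre + (i - c) * frame_period * d2)
        second.append(d2)
    return first, second


def inversion_inputs(formant_tracks, frame_period):
    """Return, per frame, a list of (F, dF/dt, d2F/dt2) tuples, one per formant."""
    derivs = [time_derivatives(t, frame_period) for t in formant_tracks]
    n_frames = len(formant_tracks[0])
    return [
        [(track[i], d1[i], d2[i]) for track, (d1, d2) in zip(formant_tracks, derivs)]
        for i in range(n_frames)
    ]
```

With two formant tracks of four frames each, `inversion_inputs` returns four frames, each holding two `(F, dF/dt, d2F/dt2)` triples. The scheme is exact for tracks that vary at most quadratically in time; measured formant tracks would likely need smoothing before differentiation.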
