Abstract

Using synthetic speech from an articulatory speech synthesizer, statistics are generated for the error between the actual articulatory configurations and those estimated by an acoustic‐to‐articulatory mapping routine. Based solely on acoustics, and neglecting aerodynamic and perceptual issues, histograms of the total estimation error suggest that the inverse problem is no more ambiguous for fricatives than for vowels. By examining the error covariance, dominant articulatory dimensions are identified in the fricative model: these have the greatest effect on the acoustic transfer function and, as a result, are better estimated by the acoustic‐to‐articulatory mapping routine. Weak articulatory dimensions are also found, which the mapping routine can estimate barely better than by simply guessing. Suggestions are made for ways in which these error statistics, and specifically the knowledge that different articulatory dimensions are of unequal importance, can be used to motivate improved techniques for the acoustic‐to‐articulatory mapping of speech. Demonstrations of some of these ideas will be given on real and synthetic speech. [Work supported by the AFOSR.]
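The abstract does not say how the error covariance is examined. One common way to read off well-estimated and poorly estimated directions is an eigendecomposition of that covariance, sketched below under stated assumptions. The function name, the three-dimensional toy "articulatory" vectors, and the noise levels are all hypothetical and are not taken from the paper. "Guessing" is modeled here as always guessing the mean configuration, so its error variance along a direction equals the prior variance along that direction.

```python
import numpy as np

def error_covariance_directions(true_params, estimated_params):
    """Eigendecompose the covariance of the estimation error.

    Returns eigenvalues in ascending order (smallest error variance first)
    and the matching eigenvectors as columns.
    """
    errors = estimated_params - true_params
    cov = np.cov(errors, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    return eigvals, eigvecs

# Hypothetical toy data: 3 articulatory dimensions with unit prior variance.
rng = np.random.default_rng(0)
n_samples, n_dims = 5000, 3
true_params = rng.normal(size=(n_samples, n_dims))

# Assumed estimator: accurate on dimension 0, poor on dimension 2.
noise_std = np.array([0.1, 0.5, 1.0])
estimated_params = true_params + rng.normal(size=(n_samples, n_dims)) * noise_std

eigvals, eigvecs = error_covariance_directions(true_params, estimated_params)

# Baseline: variance of the true parameters along each error eigenvector,
# i.e. the error variance obtained by always guessing the mean.
prior_cov = np.cov(true_params, rowvar=False)
guess_var = np.diag(eigvecs.T @ prior_cov @ eigvecs)

# Ratio near 0: direction is well estimated (dominant).
# Ratio near 1: the estimate is little better than guessing (weak).
ratio = eigvals / guess_var
for k in range(n_dims):
    print(f"direction {k}: error variance {eigvals[k]:.3f}, ratio to guessing {ratio[k]:.3f}")
```

In this toy setup the first eigenvector lines up with the accurately estimated dimension and the last with the poorly estimated one. In a real study the errors would come from comparing synthesizer parameters with the output of the mapping routine.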
