Abstract
This research considers the acoustic-to-articulatory mapping of fricative speech. By incorporating additional knowledge about fricative production, perception, and dynamics into the mapping scheme, improved performance may be achieved over standard acoustic-to-articulatory mapping techniques. A hybrid time-frequency domain articulatory synthesizer and numerical optimization techniques are applied to steady-state, unvoiced fricatives. The work differs from the fricative inverse mapping experiments of Sorokin [V. Sorokin, Speech Commun. 14, 249–262 (1994)] and Shirai [K. Shirai and S. Masaki, Speech Commun. 2, 111–114 (1983)] in two respects: a fricative-specific linked codebook is used to initialize the optimization, and amplitude-sensitive distance measures are used. The fricative codebook produces starting configurations for numerical optimization based on their spectral characteristics and their ability to match frication source characteristics for a given lung pressure and glottal configuration. Amplitude-sensitive spectral distance measures are necessary to account for the interaction between the tract and the flow-dependent frication source. Examples of acoustic-to-articulatory mapping are given that compare fricative articulatory models and fricative spectral distance measures for the inverse mapping scheme. Acoustic-to-articulatory mapping performance is evaluated in terms of spectral distance and clustering ability. [Work supported by the AFOSR.]
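To illustrate the distinction between amplitude-sensitive and gain-normalized spectral distances, the following minimal sketch compares the two on toy magnitude spectra and uses the result to pick a starting entry from a small codebook. It is not the measure or codebook used in this work: the RMS log-spectral distance, the function names, the `spectrum` and `label` fields, and the toy values are all assumptions made for illustration.

```python
import math


def log_spectrum_db(magnitudes, floor=1e-12):
    """Convert linear magnitudes to dB, flooring values to avoid log(0)."""
    return [20.0 * math.log10(max(m, floor)) for m in magnitudes]


def rms_log_spectral_distance(spec_a, spec_b, amplitude_sensitive=True):
    """RMS difference in dB between two magnitude spectra.

    With amplitude_sensitive=False the mean dB offset is removed first,
    so an overall gain difference no longer contributes to the distance.
    """
    if len(spec_a) != len(spec_b):
        raise ValueError("spectra must have the same number of bins")
    a_db = log_spectrum_db(spec_a)
    b_db = log_spectrum_db(spec_b)
    diff = [x - y for x, y in zip(a_db, b_db)]
    if not amplitude_sensitive:
        mean_offset = sum(diff) / len(diff)
        diff = [d - mean_offset for d in diff]
    return math.sqrt(sum(d * d for d in diff) / len(diff))


def nearest_codebook_entry(target, codebook, amplitude_sensitive=True):
    """Return the codebook entry whose spectrum is closest to the target."""
    return min(
        codebook,
        key=lambda entry: rms_log_spectral_distance(
            target, entry["spectrum"], amplitude_sensitive
        ),
    )


# Hypothetical codebook: same spectral shape, different overall level.
toy_codebook = [
    {"label": "low_flow", "spectrum": [1.0, 2.0, 4.0]},
    {"label": "high_flow", "spectrum": [10.0, 20.0, 40.0]},
]
toy_target = [9.0, 18.0, 36.0]

best = nearest_codebook_entry(toy_target, toy_codebook)
print(best["label"])
```

In this toy case the two codebook spectra differ only in level, so a gain-normalized distance cannot tell them apart, while the amplitude-sensitive distance prefers the entry whose level is close to the target. This is the kind of information that matters when the source strength depends on flow.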