Abstract
Throughout his highly distinguished career, Ken Stevens studied how the constrictions used in speech production give rise to the acoustic properties of the speech signal. Ken’s pioneering work in articulation, stemming from the X-ray study he participated in, and studies by his students on the timing and movement of articulatory gestures played a pivotal role in our work to develop a Speech Inversion (SI) system. The SI system, which uses deep learning, initially used ground-truth information obtained from the Wisconsin X-ray Microbeam database to estimate the articulatory trajectories of the lips, tongue tip, and tongue dorsum. Adding glottal information based on an aperiodicity/periodicity/pitch detector resulted in a significant improvement in the correlation between estimated and ground-truth data. More recently, we added ground-truth articulatory data about the velopharyngeal port constriction, using both nasometry and the novel technique of high-speed nasopharyngoscopy. In this talk, I will discuss the SI system and its ability to help us understand how speech production changes as a result of mental state, speech disorder, speaking rate, and accent.