Abstract

Articulatory learning models use conventional neural network techniques to model speech, but differ from standard approaches in their inclusion of an intermediary step in which the model is trained to produce sound through a physical model of the human vocal tract. In principle, this style of training can capture the influence of vocal tract physiology on language in a way that translates more directly to the physical world. This project seeks to create an articulatory learning model trained to recognize and repeat input sounds using the University of Wisconsin's X-Ray Microbeam Database. The system pairs an inverse model, which predicts the vocal tract configuration underlying a given speech signal, with a forward model, which predicts the acoustic output produced by a given vocal tract configuration. We further analyze the model's performance across training runs of varying length to examine the comparison to child language learning that is commonly drawn for models of this type. If that comparison holds, we expect to find commonalities between the phonetic patterning of articulatory learning neural networks and the patterns observed in the phonetic acquisition stages of child language development.
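
The inverse/forward pairing described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual configuration: it assumes an acoustic frame is a 13-dimensional feature vector (e.g., MFCCs) and the articulatory state is the 2-D positions of 8 tracked pellets, as in the X-Ray Microbeam Database, giving a 16-dimensional vector. All layer sizes and names are illustrative.

```python
import torch
import torch.nn as nn

N_ACOUSTIC = 13  # assumed: acoustic features per frame (e.g., MFCCs)
N_ARTIC = 16     # assumed: 8 microbeam pellets x (x, y) coordinates

class InverseModel(nn.Module):
    """Maps an acoustic frame to a predicted vocal tract configuration."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_ACOUSTIC, 64), nn.ReLU(),
            nn.Linear(64, N_ARTIC),
        )

    def forward(self, acoustics):
        return self.net(acoustics)

class ForwardModel(nn.Module):
    """Maps a vocal tract configuration to the acoustics it would produce."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_ARTIC, 64), nn.ReLU(),
            nn.Linear(64, N_ACOUSTIC),
        )

    def forward(self, articulation):
        return self.net(articulation)

inverse, fwd = InverseModel(), ForwardModel()
opt = torch.optim.Adam(
    list(inverse.parameters()) + list(fwd.parameters()), lr=1e-3
)
loss_fn = nn.MSELoss()

# One imitation step on a dummy batch: the model "hears" a sound, infers
# an articulation via the inverse model, re-synthesizes acoustics through
# the forward model, and is penalized for failing to reproduce the input.
heard = torch.randn(32, N_ACOUSTIC)
predicted_artic = inverse(heard)
reproduced = fwd(predicted_artic)
loss = loss_fn(reproduced, heard)
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch the forward model stands in for the physical vocal tract simulation; in practice it would be trained (or replaced) with a model grounded in the articulatory data, so that the inverse model's predictions are constrained by vocal tract physics rather than by an arbitrary learned mapping.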
