Abstract

This paper proposes a continuous vowel imitation system that explains the process of phoneme acquisition by infants from a dynamical systems perspective. Almost all existing models of this process deal with discrete phoneme sequences. Human infants, however, have no innate knowledge of phonemes; they perceive speech sounds as continuous acoustic signals. The imitation target in this study is therefore a continuous acoustic signal containing an unknown number and variety of phonemes. The key ideas of the model are (1) the use of a physical vocal tract model, the Maeda model, to embody the motor theory of speech perception, (2) the use of a dynamical system, the Recurrent Neural Network with Parametric Bias (RNNPB), trained on the dynamics of both the acoustic signals and the articulatory movements of the Maeda model, and (3) a method for segmenting a temporal sequence using the prediction error of the RNNPB model. Experiments with our model demonstrated the following results: (a) self-organization of the vowel structure into attractors of the RNNPB model, (b) improved vowel imitation when the articulatory movements of the Maeda model are used, and (c) generation of clear vowels based on a babbling process trained with a few random utterances. These results suggest that our model reflects the process of phoneme acquisition.
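Key idea (3), segmenting a temporal sequence by prediction error, can be illustrated with a minimal sketch. The abstract does not state the actual segmentation criterion, so the fixed threshold, the `min_length` parameter, and the function name below are all assumptions made for illustration; the error values are made-up numbers, not RNNPB outputs.

```python
from typing import List, Tuple


def segment_by_prediction_error(
    errors: List[float],
    threshold: float,
    min_length: int = 1,
) -> List[Tuple[int, int]]:
    """Split frames [0, len(errors)) into half-open segments (start, end).

    Hypothetical rule: a new segment begins at any frame whose prediction
    error exceeds `threshold`, provided the current segment already spans
    at least `min_length` frames.
    """
    segments: List[Tuple[int, int]] = []
    start = 0
    for t, err in enumerate(errors):
        # A spike in prediction error is taken as a sign that the
        # dynamics of the signal have changed at frame t.
        if err > threshold and t - start >= min_length:
            segments.append((start, t))
            start = t
    # Close the final segment if any frames remain.
    if len(errors) - start >= 1:
        segments.append((start, len(errors)))
    return segments


# Illustrative per-frame errors with spikes at frames 2 and 5.
example_errors = [0.1, 0.1, 0.9, 0.1, 0.1, 0.8, 0.1]
example_segments = segment_by_prediction_error(example_errors, threshold=0.5)
print(example_segments)
```

In this sketch each error spike marks a candidate boundary between stretches of the signal that a single set of dynamics predicts well; in the model described above, the per-frame errors would come from the trained RNNPB rather than from a hand-written list.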
