Abstract

Word discovery is a critical part of language acquisition for infants. Human infants can discover words directly from human speech signals. However, direct word discovery from raw speech signals remains a challenging problem for current artificial intelligence technology. This paper describes a new experimental result obtained with a state-of-the-art machine learning method that we proposed previously, which enables a robot to acquire acoustic and language models simultaneously. The robot acquires them from raw speech signals without any transcribed data, i.e., in an unsupervised manner. The method is based on the hierarchical Dirichlet process-hidden language model, a generative model that integrates language and acoustic models, and a deep sparse autoencoder, an unsupervised deep learning method. We also report new results on direct word discovery from vowel sequences. The results lead us to an open discussion about the computational process of language acquisition in human development.

