Abstract

Humans can divide continuous speech signals, which exhibit a double articulation structure, into phonemes and words without explicit boundary points or labels, and can thereby learn a language. In constructive developmental studies, learning the double articulation structure of speech signals is important for realizing robots with human-like language learning abilities. In this study, we propose a novel probabilistic generative model called the Gaussian process-hidden semi-Markov model-based double articulation analyzer (GP-HSMM-DAA), which learns phonemes and words from continuous speech signals by hierarchically connecting two probabilistic generative models (PGMs): the Gaussian process-hidden semi-Markov model and the hidden semi-Markov model. In the proposed model, the parameters of the two PGMs are updated in a mutually complementary manner, enabling accurate learning of phonemes and words. The experimental results show that GP-HSMM-DAA segments continuous speech into phonemes and words with higher accuracy than the conventional method.

