Abstract

Humans can segment perceived continuous speech signals into phonemes and words, which form a double articulation structure, without explicit boundary points or labels, and thereby learn language. Learning such a double articulation structure from speech signals is important for realizing a robot that can acquire vocabulary and hold a conversation. In this paper, we propose GP-HSMM-DAA (Gaussian Process Hidden Semi-Markov Model-based Double Articulation Analyzer), a novel statistical model that learns the double articulation structure of time-series data by connecting statistical models hierarchically. In the proposed model, the parameters of the constituent statistical models are updated mutually and learned complementarily. We show that GP-HSMM-DAA segments continuous speech into phonemes and words more accurately than baseline methods.
