Abstract

One of the biggest challenges in intelligent robotics is building robots that can learn to use language. To this end, we argue that a practical, long-term, online concept/word learning algorithm for robots is a key issue to be addressed. In this paper, we develop an unsupervised online learning algorithm based on Bayesian nonparametrics that categorizes multimodal sensory signals, such as audio, visual, and haptic information, for robots. The robot uses its physical body to grasp and observe an object from various viewpoints and listens to the sound the object makes during observation. The most important property of the proposed framework is that it learns multimodal concepts and a language model simultaneously. This mutual learning of concepts and language significantly improves both speech recognition and multimodal categorization performance. We conducted a long-term experiment in which a human subject interacted with a real robot for over 100 hours using 499 objects. Some interesting results of the experiment are discussed in this paper.
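The abstract does not name the specific model, so the following is only a minimal sketch of one common Bayesian nonparametric approach to unsupervised multimodal categorization: a Dirichlet-process Gaussian mixture fit to concatenated visual, audio, and haptic feature vectors. The feature dimensions, the synthetic data generator, and the truncation level are hypothetical illustrations, not the paper's actual method (which additionally learns a language model jointly with the concepts).

```python
# Illustrative sketch only (assumptions: DP-GMM clustering, hypothetical feature dims).
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

def make_object_observation(concept_id, n_views=5):
    """Simulate multimodal features for one object: each latent concept shifts the
    means of its visual, audio, and haptic feature vectors (hypothetical dimensions)."""
    visual = rng.normal(loc=concept_id, scale=0.3, size=(n_views, 8))       # e.g. appearance features
    audio = rng.normal(loc=2.0 * concept_id, scale=0.3, size=(n_views, 4))  # e.g. shaking-sound features
    haptic = rng.normal(loc=-concept_id, scale=0.3, size=(n_views, 3))      # e.g. grasping features
    return np.hstack([visual, audio, haptic])

# Observations of objects drawn from 3 latent concepts (ground truth unknown to the model).
X = np.vstack([make_object_observation(c) for c in range(3) for _ in range(10)])

# Dirichlet-process mixture: n_components is only a truncation-level upper bound;
# the stick-breaking prior lets the effective number of concepts be inferred from data.
dpgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="diag",
    max_iter=500,
    random_state=0,
)
labels = dpgmm.fit_predict(X)
print("inferred concept labels:", np.unique(labels))
```

In such a setup, categorization is unsupervised in the same sense as in the abstract: the number of object concepts is not fixed in advance but is inferred from the multimodal observations themselves.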
