Abstract

A robot working with humans or other robots is expected to adapt to changes in its environment. Reinforcement learning has been widely studied for motor skill learning, robot behavior acquisition, and adaptation of behavior to environmental changes. However, it is impractical for a robot to learn and adapt its behavior from scratch solely through trial and error, because the required exploration is enormous. Fortunately, predecessors are commonly present in the environment, and it is reasonable for the robot to learn from observing their behavior. To learn various behaviors from observation, the robot must segment the observed behavior according to criteria that are meaningful to itself and feed the resulting data back into its own behavior learning. This paper presents a case study in which a robot comes to understand unfamiliar behavior demonstrated by a human instructor through collaboration between behavior acquisition and recognition of observed behavior; here, the state value plays an important role not only in behavior acquisition (reinforcement learning) but also in behavior recognition (observation). The validity of the proposed method is demonstrated by applying it to a dynamic environment in which one robot and one human play soccer.
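The abstract's central idea, reusing a learned state value function for recognition as well as acquisition, can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes (hypothetically) that the robot holds one learned value function per acquired behavior and scores an observed state sequence by how consistently that value increases along it, labeling the observation with the best-scoring behavior. All names and the toy 1-D state space are illustrative assumptions.

```python
# Hypothetical sketch, not the paper's method: recognize observed behavior
# by checking which learned state value function rises most consistently
# along the observed state sequence.

def consistency(values):
    """Fraction of transitions on which the state value increases."""
    steps = list(zip(values, values[1:]))
    if not steps:
        return 0.0
    return sum(v2 > v1 for v1, v2 in steps) / len(steps)

def recognize(observed_states, value_functions):
    """Return the behavior label whose value function increases most
    consistently over the observed states."""
    scores = {
        label: consistency([V(s) for s in observed_states])
        for label, V in value_functions.items()
    }
    return max(scores, key=scores.get)

# Toy example: states are 1-D positions. "approach_goal" values proximity
# to x = 10; "stay_near_wall" values proximity to x = 0.
value_functions = {
    "approach_goal": lambda s: -abs(10.0 - s),
    "stay_near_wall": lambda s: -s,
}
trajectory = [0.0, 2.0, 4.0, 6.0, 8.0]  # the observed agent moves toward x = 10
print(recognize(trajectory, value_functions))  # → approach_goal
```

Under this reading, the same value function that drives the robot's own greedy action selection during reinforcement learning doubles as a recognizer: a demonstration "looks like" a behavior when it climbs that behavior's value landscape.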
