Abstract

Recently, biologically inspired robots have been developed to acquire the capacity to direct visual attention to salient stimuli generated in the audiovisual environment. A general method for realizing this behavior is to calculate saliency maps that represent how strongly external information attracts the robot's visual attention, where both the audiovisual information and the robot's motion status should be involved. In this paper, we present a visual attention model in which three modalities—audio information, visual information, and the robot's motor status—are considered, because previous research has not considered all of them. First, we introduce a 2D density map whose values denote how much attention the robot pays to each spatial location. Then we model the attention density using a Bayesian network that incorporates the robot's motion status. Next, the information from both the audio and visual modalities is integrated with the attention density map by integrate-and-fire neurons. The robot directs its attention to the locations where the integrate-and-fire neurons fire. Finally, the visual attention model is applied so that the robot selects visual information from the environment and reacts to the selected content. Experimental results show that robots can acquire the visual information related to their behaviors by using the attention model that considers motion status. The robot can select behaviors to adapt to a dynamic environment as well as switch to another task according to the recognition results of visual attention. © 2006 Wiley Periodicals, Inc. Electr Eng Jpn, 158(2): 39–48, 2007; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20335
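The abstract only outlines the mechanism, so the following is a minimal sketch, under stated assumptions, of how audio and visual saliency might be combined with a 2D attention density map by a grid of integrate-and-fire neurons; the function name `fire_locations`, the leak and threshold parameters, and the update rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Illustrative sketch (not the paper's method): integrate-and-fire neurons
# accumulate audio and visual saliency, weighted by a 2D attention density
# map, and "fire" at locations whose potential crosses a threshold.
def fire_locations(audio_saliency, visual_saliency, attention_density,
                   potential, threshold=1.0, leak=0.1):
    """One update step over an H x W grid of integrate-and-fire neurons.

    All inputs are H x W arrays. Returns the updated potential and a
    boolean map of fired locations.
    """
    # Combine the two modalities and weight by the current attention density.
    drive = (audio_saliency + visual_saliency) * attention_density
    # Leaky integration of the driving input.
    potential = (1.0 - leak) * potential + drive
    fired = potential >= threshold
    # Reset neurons that fired.
    potential = np.where(fired, 0.0, potential)
    return potential, fired

# Usage example on a small 4x4 grid with random saliency inputs.
H, W = 4, 4
rng = np.random.default_rng(0)
potential = np.zeros((H, W))
for _ in range(5):
    potential, fired = fire_locations(rng.random((H, W)),
                                      rng.random((H, W)),
                                      np.full((H, W), 0.5),
                                      potential)
# Attention would be directed to the coordinates where `fired` is True.
print(np.argwhere(fired))
```

In this reading, the attention density map acts as a spatial gain on the combined saliency, so locations the Bayesian network already favors (given the robot's motion status) reach the firing threshold sooner.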