Abstract

Humans learn multisensory eye-hand coordination from infancy without supervision. For example, they learn to track their hands by exploiting various sensory modalities, such as vision and proprioception. This integration occurs as they learn to perceive the world around them and their relationship to it. Most prior work has focused on the role of vision, as it is a primary sensory source for humans. However, it is also interesting to study how vision and proprioception interact. We propose a system that combines visual and proprioceptive information to learn the eye-hand coordination skills that enable a robot to fixate its camera gaze on the end effector of its arm. In our model, visual cues are part of a feedback control loop, whereas proprioceptive cues are part of a feedforward control loop. Both controllers, as well as the sensory transform from raw visual information to a neural sensory representation, are learned as the robot performs motor babbling movements. Visual information is encoded by sparse coding, and the basis functions that emerge are similar to the receptive fields in the human visual cortex. An actor-critic reinforcement learning algorithm drives the eye motor neurons by fusing visual and proprioceptive cues. We model and test the system in the iCub simulation environment. Our results suggest that the two sensory modalities together allow the model parameters to be learned jointly so that the tracking task can be performed. The evolved policy has characteristics that are qualitatively similar to those of the human oculomotor plant.
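
The abstract names two learned components: a sparse-coding transform of the visual input and an actor-critic learner that drives the eye motors from the fused visual and proprioceptive cues. The Python sketch below is not the authors' implementation; it only illustrates, under assumed dimensions, learning rates, and a placeholder reward, how such pieces might fit together. All names (sparse_code, actor_critic_step, N_VISUAL, N_PROPRIO, etc.) are illustrative assumptions.

```python
# Minimal sketch (assumptions throughout): sparse-coded visual features and
# proprioceptive joint angles are concatenated into a state vector, and a
# linear TD(0) actor-critic learns eye motor commands from that state.
import numpy as np

rng = np.random.default_rng(0)

N_VISUAL = 64      # assumed number of sparse-coding coefficients per image patch
N_PROPRIO = 7      # assumed number of arm joint angles
N_MOTOR = 2        # eye pan/tilt velocity commands
STATE_DIM = N_VISUAL + N_PROPRIO

# Linear function approximators for the critic (value) and actor (policy mean).
critic_w = np.zeros(STATE_DIM)
actor_w = np.zeros((N_MOTOR, STATE_DIM))

ALPHA_CRITIC = 1e-2   # critic learning rate (assumed)
ALPHA_ACTOR = 1e-3    # actor learning rate (assumed)
GAMMA = 0.95          # discount factor (assumed)
SIGMA = 0.1           # Gaussian exploration noise, in the spirit of motor babbling


def sparse_code(image_patch, dictionary, n_iter=50, lam=0.1, lr=0.1):
    """ISTA-style sparse coding: find coefficients a such that
    dictionary @ a approximates the patch under an L1 penalty."""
    a = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        residual = image_patch - dictionary @ a
        a = a + lr * dictionary.T @ residual
        a = np.sign(a) * np.maximum(np.abs(a) - lr * lam, 0.0)  # soft threshold
    return a


def actor_critic_step(state, reward, next_state, noise):
    """One TD(0) actor-critic update with a Gaussian policy."""
    global critic_w, actor_w
    td_error = reward + GAMMA * critic_w @ next_state - critic_w @ state
    critic_w += ALPHA_CRITIC * td_error * state
    # Policy gradient: shift the action mean toward perturbations that raised value.
    actor_w += ALPHA_ACTOR * td_error * np.outer(noise / SIGMA**2, state)
    return td_error


# Toy usage: random patches and joint angles stand in for the simulator.
dictionary = rng.standard_normal((256, N_VISUAL))
dictionary /= np.linalg.norm(dictionary, axis=0)

state = np.concatenate([sparse_code(rng.standard_normal(256), dictionary),
                        rng.uniform(-1, 1, N_PROPRIO)])
for step in range(10):
    noise = SIGMA * rng.standard_normal(N_MOTOR)
    action = actor_w @ state + noise   # command that would be sent to the eye joints
    # Placeholder reward: negative offset of the hand from the image centre.
    reward = -np.abs(rng.standard_normal())
    next_state = np.concatenate([sparse_code(rng.standard_normal(256), dictionary),
                                 rng.uniform(-1, 1, N_PROPRIO)])
    actor_critic_step(state, reward, next_state, noise)
    state = next_state
```

In a simulator such as iCub, the random patches and joint angles above would be replaced by camera images and arm encoder readings, and the reward by a measure of how well the end effector is centred in the view.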
