Abstract
Imitation is a form of social learning in which an individual observes and copies another's actions. This paper presents a new method that uses imitation to enhance the learning speed of individual agents employing a well-known reinforcement learning algorithm, Q-learning. Compared with other research combining imitation with reinforcement learning, our method relies purely on imitation of observed behaviours, with no access to other agents' internal states and no sharing of experiences between agents. The paper evaluates this imitation-enhanced reinforcement learning approach both in simulation and with real robots operating in continuous space. Results from both settings show that the learning speed of the group is improved.
Highlights
Social learning, which enables individuals to learn from others in a community, is an important mechanism for social animals
Imitation learning differs from other adaptive learning algorithms used in robotic research, including reinforcement learning (Barto et al., 2004), evolutionary algorithms (Nolfi and Floreano, 2000) and supervised learning (Rumelhart et al., 1986), in that learning by imitation is based upon social interactions
This paper presents a simple method for linking reinforcement learning with imitation
Summary
Social learning, which enables individuals to learn from others in a community, is an important mechanism for social animals. Imitation learning differs from other adaptive learning algorithms used in robotic research, including reinforcement learning (Barto et al., 2004), evolutionary algorithms (Nolfi and Floreano, 2000) and supervised learning (Rumelhart et al., 1986), in that learning by imitation is based upon social interactions. Another important aspect of imitation is that the only information transferred between agents is the set of observed actions. Compared with other research combining imitation with reinforcement learning, our method relies purely on imitation of observed behaviours, with no access to other agents' internal states and no sharing of experiences between agents. Both simulation and real-robot experimental results show that the learning speed of the agents is improved.
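As a rough illustration of the idea, the sketch below combines a standard tabular Q-learning update with action selection that sometimes copies a demonstrator's observed action instead of exploring at random. This is not the paper's exact mechanism; the function names, the imitation probability `p_imitate`, and the toy chain environment are illustrative assumptions only.

```python
import random

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard Q-learning update rule."""
    best_next = max(Q[s_next].values())
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

def choose_action(Q, s, observed=None, epsilon=0.2, p_imitate=0.5):
    """Epsilon-greedy action selection, biased toward an observed action.

    Only the demonstrator's visible action is used -- no internal state
    or experience is shared, mirroring the constraint in the paper.
    """
    acts = list(Q[s].keys())
    if observed is not None and random.random() < p_imitate:
        return observed                      # imitate the observed action
    if random.random() < epsilon:
        return random.choice(acts)           # explore
    return max(acts, key=lambda a: Q[s][a])  # exploit

# Toy 1-D chain: states 0..3, reward at state 3; actions -1 (left), +1 (right).
states, actions = range(4), (-1, +1)
Q = {s: {a: 0.0 for a in actions} for s in states}
random.seed(0)
for episode in range(200):
    s = 0
    while s != 3:
        # Hypothetical demonstrator always moves right; the learner
        # observes that action and sometimes copies it.
        a = choose_action(Q, s, observed=+1)
        s_next = min(max(s + a, 0), 3)
        r = 1.0 if s_next == 3 else 0.0
        q_update(Q, s, a, r, s_next)
        s = s_next
```

Because imitated actions steer exploration toward behaviour that reaches the reward, the Q-values for the demonstrated action are reinforced faster than they would be under purely random exploration, which is the intuition behind the claimed speed-up.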