Automatic collective motion tuning using actor-critic deep reinforcement learning

Shadi Abpeikar, Kathryn Kasmarik, Matthew Garratt, Robert Hunjet, Huanneng Qiu, Md Mohiuddin Khan

doi:10.1016/j.swevo.2022.101085

Abstract

Collective behaviours such as swarm formation of autonomous agents offer the advantages of efficient movement, redundancy, and the potential for human guidance of a single swarm organism. However, tuning the behaviour of a group of agents so that they swarm is difficult. Behaviour-bootstrapping algorithms permit agents to self-tune behaviour adapted to their physical form and associated movement constraints. This paper proposes a reinforcement learning framework that tunes collective motion behaviours starting from random behaviours. The learning process is guided by a novel reward function capable of autonomously detecting generic collective motion behaviours from sensor data about the relative velocity and position of neighbouring agents. Our reward function is designed using a meta-learner trained on a human-labelled collective motion dataset. We demonstrate that our reinforcement learner can tune the behaviour of randomly moving groups so that structured collective motion emerges. We compare our framework to an existing developmental evolutionary framework for this purpose. Our results demonstrate that the proposed learning framework can generate behaviours with different collective motion characteristics more quickly than existing approaches. In addition, the trained reinforcement learner can tune the behaviour of robots with movement characteristics that it has not been trained on.
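
To make the idea of a reward computed from neighbours' relative position and velocity concrete, the following is a minimal sketch only. It is not the reward function of the paper, which is built from a meta-learner trained on human-labelled data; here a hand-written score stands in for that learned detector. The function name `swarm_reward`, the `target_spacing` parameter, and the alignment and cohesion terms are all assumptions made for illustration.

```python
import math


def swarm_reward(rel_positions, rel_velocities, target_spacing=1.0):
    """Hypothetical hand-written stand-in for a learned collective-motion reward.

    rel_positions / rel_velocities: lists of (x, y) tuples giving each
    neighbour's position and velocity relative to the observing agent.
    Returns a score in [0, 1]; higher means more swarm-like.
    """
    if not rel_positions:
        return 0.0  # no neighbours sensed, so no collective motion

    n = len(rel_positions)

    # Alignment: small relative velocities mean neighbours move with the agent.
    mean_rel_speed = sum(math.hypot(vx, vy) for vx, vy in rel_velocities) / n
    alignment = 1.0 / (1.0 + mean_rel_speed)

    # Cohesion: neighbours should sit near the assumed target spacing.
    mean_dist = sum(math.hypot(x, y) for x, y in rel_positions) / n
    cohesion = 1.0 / (1.0 + abs(mean_dist - target_spacing))

    return alignment * cohesion


# Neighbours at the target spacing, moving with the agent, versus scattered ones.
structured = swarm_reward([(1.0, 0.0), (0.0, 1.0)], [(0.0, 0.0), (0.0, 0.0)])
scattered = swarm_reward([(5.0, 0.0), (0.0, -4.0)], [(2.0, 1.0), (-1.0, 3.0)])
print(structured, scattered)
```

In a reinforcement learning loop, a score of this kind would be evaluated after each behaviour-parameter update, so that parameter changes moving the group toward structured motion receive higher reward.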

Full Text