Automated design of action advising trigger conditions for multiagent reinforcement learning: A genetic programming-based approach

Tonghao Wang, Xingguang Peng, Tao Wang, Tong Liu, Demin Xu

doi:10.1016/j.swevo.2024.101475

Abstract

Action advising is a popular and effective approach to accelerating independent multiagent reinforcement learning (MARL), especially in learning systems where all agents learn from scratch and their roles (advisor or advisee) cannot be predefined. The key component of action advising is the trigger condition, which determines when to advise. Previous work has mainly focused on designing novel trigger conditions manually; however, because those conditions are often designed heuristically, their performance may be affected by the designers' preferences. To address this, this paper attempts to solve the action advising problem automatically using genetic programming (GP), an evolutionary computation technique. A framework that incorporates GP into action advising is proposed, together with a novel population initialization method intended to enhance performance. Empirical studies demonstrate the effectiveness of the proposed framework. More importantly, thanks to the high transparency of GP, a comprehensive analysis of the results is also conducted. Insights into the action advising problem are distilled from this analysis and may guide future work.

Full Text