Abstract

Recently, deep reinforcement learning (RL) algorithms have achieved significant progress in the multi-agent domain. However, training for increasingly complex tasks is time-consuming and resource-intensive. Efficiently leveraging historical experience can alleviate this problem, yet it remains under-explored: most existing methods, owing to their complicated designs, fail to reuse knowledge in a continuously dynamic system. In this paper, we propose a knowledge-reuse method called “KnowRU”, which can be easily deployed in the majority of multi-agent reinforcement learning (MARL) algorithms without complicated hand-coded design. We employ the knowledge distillation paradigm to transfer knowledge among agents, shortening the training phase for new tasks while improving the agents' asymptotic performance. To demonstrate the robustness and effectiveness of KnowRU empirically, we perform extensive experiments with state-of-the-art MARL algorithms in collaborative and competitive scenarios. The results show that KnowRU outperforms recently reported methods: it not only accelerates the training phase but also improves training performance, underscoring the importance of knowledge reuse for MARL.
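The distillation paradigm mentioned above, transferring a teacher agent's policy to a student agent on a new task, is commonly implemented as a KL divergence between temperature-softened action distributions. A minimal sketch of such a loss (function names and the temperature value are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened action
    distributions, averaged over a batch of states."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's current policy
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return kl.mean()
```

Minimizing this term alongside the usual RL objective pulls the student's action distribution toward the teacher's on states from the new task, which is the generic mechanism behind distillation-based knowledge transfer.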

Highlights

  • Reinforcement learning (RL) has made great progress in solving complicated tasks, such as Atari games [1], board games [2], and video-game playing [1]

  • We propose a method for knowledge reuse called KnowRU, which can be deployed in multi-agent RL (MARL) algorithms

  • Markov decision processes (MDPs) [5] in MARL can be denoted as a tuple ⟨S, U, T, R1…Rn, γ⟩, where S is the state space, U is the joint action space, T is the state transition function, Ri is the reward function of agent i (i = 1…n), γ is the discount factor, and n is the number of agents
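The tuple above maps directly onto a small data structure; a sketch of one possible encoding (type aliases and field names are illustrative, not from the paper):

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Placeholder types: states and per-agent actions as integers,
# a joint action as one action per agent.
State = int
Action = int
JointAction = Tuple[Action, ...]

@dataclass
class MultiAgentMDP:
    states: List[State]                  # S: state space
    joint_actions: List[JointAction]     # U: joint action space
    # T(s, u) -> distribution over next states, as {s': P(s' | s, u)}
    transition: Callable[[State, JointAction], Dict[State, float]]
    # R_i(s, u) for each agent i = 1..n
    rewards: List[Callable[[State, JointAction], float]]
    gamma: float                         # discount factor

    @property
    def n_agents(self) -> int:
        return len(self.rewards)
```

A two-agent instance would supply two entries in `rewards`, and each agent's objective is its own discounted return under the shared transition dynamics.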



Introduction

Reinforcement learning (RL) has made great progress in solving complicated tasks, such as Atari games [1], board games [2], and video-game playing [1]. With the compelling performance of single-agent models, multi-agent RL (MARL) tasks, such as collaboration and competition among multiple agents, have piqued the interest of researchers in several fields [3,4], as the applications of MARL are evident. However, training agents from scratch for increasingly complex tasks is time-consuming and resource-intensive. Efficient transfer and reuse of knowledge between tasks can alleviate these issues, and sustained efforts have been made in this field. One category of solutions employs the transfer learning paradigm to reuse knowledge from historical experience, which can relieve the burden of training a new model by drawing on previous experience [5]

