Abstract

Recent advancements in deep reinforcement learning (DRL) have led to its application in multi-agent scenarios to solve complex real-world problems, such as network resource allocation and sharing, network routing, and traffic signal control. Multi-agent DRL (MADRL) enables multiple agents to interact with each other and with their operating environment, and to learn without the need for external critics (or teachers), thereby solving complex problems. Significant performance enhancements from the use of MADRL have been reported in multi-agent domains; for instance, it has been shown to provide higher quality of service (QoS) in network resource allocation and sharing. This paper presents a survey of MADRL models that have been proposed for various kinds of multi-agent domains, using a taxonomic approach that highlights various aspects of MADRL models and applications, including objectives, characteristics, challenges, applications, and performance measures. Furthermore, we present open issues and future directions of MADRL.

Highlights

  • In multi-agent deep reinforcement learning (MADRL), a group of agents (or decision makers) interact with each other and their operating environment to achieve goals in a cooperative or competitive manner

  • Multi-agent DRL (MADRL) extends traditional reinforcement learning (RL) and multi-agent reinforcement learning (MARL) with deep learning (DL), a recent advancement in artificial intelligence

  • This paper extends the existing literature by providing a survey of MADRL algorithms applied to various state-of-the-art applications, and presenting a taxonomy of MADRL


Summary

Introduction

In multi-agent deep reinforcement learning (MADRL), a group of agents (or decision makers) interact with each other and their operating environment to achieve goals in a cooperative or competitive manner. MADRL extends traditional reinforcement learning (RL) and multi-agent reinforcement learning (MARL) with deep learning (DL), a recent advancement in artificial intelligence. The traditional RL approach [13], which is formulated based on the Markov decision process (MDP) [8,12,14,15], enables a single agent (or decision maker) to interact with its operating environment in a trial-and-error manner, learn a policy (e.g., a control policy), and perform sequential decision making to optimize system performance.
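The single-agent, trial-and-error loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, one common RL algorithm, chosen here only for illustration; the source does not name a specific algorithm at this point. The chain environment, the function names `q_learning_chain` and `greedy_policy`, and all hyperparameter values are hypothetical and not taken from the paper.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain MDP (illustrative only).

    States are 0..n_states-1. The agent starts in state 0 and receives a
    reward of 1.0 when it reaches the last state, which ends the episode.
    Actions: 0 = move left, 1 = move right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # Q-table: q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            # Epsilon-greedy action selection (the "trial and error" part).
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                # Break ties randomly so the untrained agent still explores.
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # Deterministic transition of the assumed environment.
            if action == 0:
                next_state = max(0, state - 1)
            else:
                next_state = min(goal, state + 1)
            done = next_state == goal
            reward = 1.0 if done else 0.0

            # Q-learning update toward the bootstrapped target.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the action with the highest Q-value for each state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(q_learning_chain())` yields the learned policy as a list of actions per state. In MADRL, as the text describes, several such learners interact with one another and deep networks replace the table; this sketch covers only the single-agent case.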

