Adaptive Learning: A New Decentralized Reinforcement Learning Approach for Cooperative Multiagent Systems

Meng-Lin Li, Shaofei Chen, Jing Chen

doi:10.1109/access.2020.2997899

Open Access

https://doi.org/10.1109/access.2020.2997899

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 13
License type: CC BY 4.0

Affiliation: National University of Defense Technology

Abstract

Multiagent systems (MASs) have received extensive attention in a variety of domains, such as robotics and distributed control. This paper focuses on how independent learners (ILs, structures used in decentralized reinforcement learning) decide on their individual behaviors to achieve coherent joint behavior. To date, reinforcement learning (RL) approaches for ILs have not guaranteed convergence to the optimal joint policy in scenarios in which communication is difficult. In a decentralized algorithm in particular, the share of credit due to a single agent's action is not distinguished, which can lead to miscoordination of joint actions. It is therefore important to study the mechanisms of coordination between agents in MASs. Most previous coordination mechanisms work by modeling the communication mechanism and the policies of other agents. These methods apply only to a particular system, so they do not generalize well, especially when there are dozens of agents or more. This paper therefore focuses mainly on MASs that contain more than a dozen agents. Parallel computation is used to bring the experimental environment closer to the application scenario. Based on a study of the centralized training and decentralized execution (CTDE) paradigm, a multiagent reinforcement learning algorithm for implicit coordination based on the TD error is proposed. The new algorithm dynamically adjusts the learning rate, drawing on a detailed analysis of the dissonance problem in the matrix game combined with a multiagent environment. By adjusting the dynamic learning rate between agents, the agents' strategies can be coordinated. Experimental results show that the proposed algorithm can effectively improve the coordination ability of a MAS. Moreover, the training results are more stable (lower variance) than those of the hysteretic Q-learning (HQL) algorithm. Hence, the problem of miscoordination in a MAS can be avoided to some extent without additional communication. Our work provides a new way to address the miscoordination problem for reinforcement learning algorithms at the scale of dozens of agents or more. As a new IL-structured algorithm, our results should be extended and studied further.
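To make the role of the TD error in the learning rate concrete, the following is a minimal sketch of the hysteretic Q-learning (HQL) baseline named in the abstract, written for a stateless matrix game. It is not the paper's adaptive algorithm, whose exact rule is not given on this page. The function name `hysteretic_update` and the values of `alpha` and `beta` are assumptions chosen for illustration.

```python
def hysteretic_update(q, reward, alpha=0.1, beta=0.01):
    """One stateless hysteretic Q-learning step for one independent learner.

    q      -- the agent's current estimate for the action it just took
    reward -- the shared team reward observed for the joint action
    alpha  -- learning rate used when the TD error is non-negative (assumed value)
    beta   -- smaller learning rate used when the TD error is negative (assumed value)
    """
    # In a single-step matrix game there is no next state, so the TD error
    # reduces to the difference between the reward and the current estimate.
    td_error = reward - q
    # Learn quickly from better-than-expected outcomes and slowly from worse
    # ones, since a bad outcome may be caused by a teammate's exploration.
    rate = alpha if td_error >= 0 else beta
    return q + rate * td_error
```

With these illustrative rates, a pleasant surprise moves the estimate ten times faster than an equally large disappointment. This asymmetry is what lets an independent learner remain optimistic about actions that are good only when teammates cooperate.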

Highlights

In the past decade, multiagent systems (MASs) have attracted considerable attention in many fields, especially for intelligent multirobot systems, road traffic signal control, distributed system control [1], etc.
The hysteretic Q-learning (HQL) and adaptive Q-learning (AQL) algorithms both converge to an optimal Nash equilibrium.
The final convergence results of HQL and AQL show that dynamic adjustment of the learning rate can give decentralized agents the ability to distinguish the merits of their actions.
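As an illustration of the setting these highlights refer to, the sketch below trains two independent learners with the hysteretic update on the climbing game, a standard cooperative matrix game from the literature. The choice of game, the function name `train`, and all hyperparameters are assumptions for illustration, not the paper's experimental setup. No particular outcome is claimed for it.

```python
import random

# Climbing game payoff matrix (rows: agent 0's action, columns: agent 1's action).
# Both agents receive the same reward; the optimal joint action is (0, 0).
PAYOFF = [[11, -30, 0],
          [-30, 7, 6],
          [0, 0, 5]]

def train(episodes=5000, alpha=0.1, beta=0.01, epsilon=0.1, seed=0):
    """Train two independent hysteretic Q-learners; all defaults are assumed values."""
    rng = random.Random(seed)
    # Each agent keeps a private Q-table over its own three actions only;
    # neither agent observes the other's action or Q-values.
    q = [[0.0] * 3, [0.0] * 3]
    for _ in range(episodes):
        actions = []
        for agent in range(2):
            if rng.random() < epsilon:
                actions.append(rng.randrange(3))  # explore
            else:
                actions.append(max(range(3), key=lambda a: q[agent][a]))  # exploit
        reward = PAYOFF[actions[0]][actions[1]]
        for agent, a in enumerate(actions):
            td_error = reward - q[agent][a]
            # The sign of the TD error selects the learning rate.
            rate = alpha if td_error >= 0 else beta
            q[agent][a] += rate * td_error
    greedy = tuple(max(range(3), key=lambda a: q[i][a]) for i in range(2))
    return q, greedy
```

Calling `train()` returns the two Q-tables and the greedy joint action. Comparing the greedy joint action across seeds, or setting `beta` equal to `alpha` to recover plain independent Q-learning, is one way to see how the asymmetric rate affects coordination.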

Summary

Introduction

Multiagent systems (MASs) have attracted considerable attention in many fields, especially for intelligent multirobot systems, road traffic signal control, distributed system control [1], etc. MASs are very convenient for practical applications. A decentralized MAS point of view offers several potential advantages, such as increased speed, scalability and robustness [2]. We focus on the coordination mechanism in a fully cooperative multiagent reinforcement learning algorithm.

The associate editor coordinating the review of this manuscript and approving it for publication was Mostafa M.
