Abstract

The Policy Gradient (PG) method is one of the most popular algorithms in Reinforcement Learning (RL). However, distributed adaptive variants of PG are rarely studied in multi-agent settings. For this reason, this paper proposes a distributed adaptive policy gradient algorithm (IS-DAPGM) that incorporates Adam-type updates and an importance sampling technique. Furthermore, we establish a theoretical convergence rate of $\mathcal{O}(1/\sqrt{T})$, where $T$ denotes the number of iterations, which matches the convergence rate of state-of-the-art centralized policy gradient methods. In addition, extensive experiments are conducted in a multi-agent environment, a modified version of the Particle World environment. By comparing with other distributed PG methods and varying the number of agents, we verify that IS-DAPGM is more efficient than the existing methods.
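The abstract names two algorithmic ingredients: an importance sampling correction and an Adam-type adaptive update. The following is a minimal, hypothetical single-agent sketch, not the paper's IS-DAPGM and without any of the distributed multi-agent machinery, assuming a simple softmax bandit policy; it only illustrates how an importance weight and an Adam-type step can fit into a policy gradient update. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

# Toy setup (illustrative only, not from the paper): a softmax policy over a
# few discrete actions, parameterized by a weight vector theta.
rng = np.random.default_rng(0)
n_actions = 4

def policy(theta):
    """Softmax action probabilities for a single (stateless) decision."""
    z = theta - theta.max()
    p = np.exp(z)
    return p / p.sum()

def grad_log_pi(theta, a):
    """Gradient of log pi(a | theta) for the softmax policy."""
    g = -policy(theta)
    g[a] += 1.0
    return g

# Adam-type optimizer state
theta = np.zeros(n_actions)
m = np.zeros(n_actions)
v = np.zeros(n_actions)
alpha, beta1, beta2, eps = 0.05, 0.9, 0.999, 1e-8

# Expected reward of each action (unknown to the learner)
true_rewards = np.array([0.1, 0.2, 0.9, 0.4])

theta_old = theta.copy()  # behaviour policy that generated the samples
for t in range(1, 501):
    probs_old = policy(theta_old)
    a = rng.choice(n_actions, p=probs_old)            # sample from the old policy
    r = true_rewards[a] + 0.1 * rng.standard_normal()

    # Importance weight corrects for reusing a sample drawn under theta_old
    # when estimating the gradient at the current theta.
    w = policy(theta)[a] / probs_old[a]
    g = w * r * grad_log_pi(theta, a)                 # IS-weighted REINFORCE-style gradient

    # Adam-type ascent step (maximizing expected reward)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    theta += alpha * m_hat / (np.sqrt(v_hat) + eps)

    if t % 100 == 0:
        theta_old = theta.copy()  # periodically refresh the behaviour policy

print("learned action probabilities:", np.round(policy(theta), 3))
```

In this sketch the importance weight lets gradients be estimated from samples collected under an older policy, which is the role such a correction typically plays in reducing sampling cost; the actual distributed, multi-agent construction and its analysis are given in the paper itself.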

