Abstract
Recent years have seen the application of deep reinforcement learning techniques to cooperative multi-agent systems, with great empirical success. However, given the lack of theoretical insight, it remains unclear what the employed neural networks are learning, or how we should enhance their learning power to address the problems on which they fail. In this work, we empirically investigate the learning power of various network architectures on a series of one-shot games. Despite their simplicity, these games capture many of the crucial problems that arise in the multi-agent setting, such as an exponential number of joint actions or the lack of an explicit coordination mechanism. Our results extend those in Castellini et al. (Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, AAMAS '19, International Foundation for Autonomous Agents and Multiagent Systems, pp 1862–1864, 2019), quantify how well various approaches can represent the requisite value functions, and help us identify factors that can impede good performance, such as sparsity of the values or overly tight coordination requirements.
Highlights
Multi-agent reinforcement learning (MARL) applies reinforcement learning to train multiple agents in cooperative multi-agent systems, and can lead to flexible and robust solutions [3, 10, 23, 43, 47]
Definition 1 One-shot games: a one-shot game [33] consists of the tuple $M = \langle D, \{A_i\}_{i=1}^{n}, \{Q_i\}_{i=1}^{n} \rangle$, where $D = \{1, \dots, n\}$ is the set of agents, $A_i$ is the set of actions for agent $i$, and $Q_i$ is the reward function for agent $i$ that depends only on the joint action $a \in A = \times_{i=1}^{n} A_i$ performed by the full team of agents, which expresses how much reward agent $i$ gets from the overall team decision.
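As a minimal sketch of Definition 1, the Python class below stores the agent set, per-agent action sets, and per-agent reward functions over joint actions. The class name `OneShotGame` and its methods are illustrative choices of ours, not part of the paper.

```python
import itertools
from typing import Dict, List, Tuple

# Illustrative encoding of Definition 1 (names are ours, not the paper's):
# a one-shot game holds the agent set D, per-agent action sets A_i, and
# per-agent reward functions Q_i defined over joint actions.
class OneShotGame:
    def __init__(self, action_sets: List[List[int]],
                 rewards: List[Dict[Tuple[int, ...], float]]):
        self.n = len(action_sets)        # |D|, the number of agents
        self.action_sets = action_sets   # A_i for each agent i
        self.rewards = rewards           # Q_i(a) for each agent i

    def joint_actions(self):
        # A = A_1 x ... x A_n: the joint action space, of size
        # prod_i |A_i|, i.e. exponential in the number of agents.
        return itertools.product(*self.action_sets)

    def Q(self, i: int, a: Tuple[int, ...]) -> float:
        # Reward agent i receives for the team's joint action a.
        return self.rewards[i][a]
```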
Problem Statement: Given the original action-value function $Q(a)$ and a learned representation $\hat{Q}(a)$, we are interested in investigating the quality of this learned representation, both in terms of action ranking, i.e., $\sigma(R(Q), R(\hat{Q}))$, where $\sigma$ is a similarity measure and $R$ is a partial ordering of the joint actions according to their action-values, so that the learned function can reliably be used for decision making; and in terms of the reconstruction error of the representation, computed using the mean squared error: $\mathrm{MSE} = \frac{1}{|A|} \sum_{a \in A} \big( Q(a) - \hat{Q}(a) \big)^2$.
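The two criteria can be made concrete with a short sketch. The problem statement leaves the similarity measure $\sigma$ abstract; here we instantiate it with Kendall's $\tau$ as one possible choice, and compute the MSE over the (flattened) joint action space. This is our own illustration, not code from the paper.

```python
import numpy as np
from scipy.stats import kendalltau

def evaluate_representation(Q_true: np.ndarray, Q_hat: np.ndarray):
    """Compare a learned action-value vector Q_hat against the true Q,
    both indexed by the same enumeration of the joint action space A."""
    # Ranking quality: Kendall's tau between the orderings R(Q) and
    # R(Q_hat) -- one concrete choice for the similarity measure sigma.
    tau, _ = kendalltau(Q_true, Q_hat)
    # Reconstruction error: MSE = (1/|A|) * sum_a (Q(a) - Q_hat(a))^2.
    mse = np.mean((Q_true - Q_hat) ** 2)
    return tau, mse
```

A value of $\tau$ close to 1 means the learned function orders joint actions almost exactly as the true function does, which is what matters for action selection even when the MSE is nonzero.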
Summary
Multi-agent reinforcement learning (MARL) applies reinforcement learning to train multiple agents in cooperative multi-agent systems, and can lead to flexible and robust solutions [3, 10, 23, 43, 47]. A variety of deep MARL approaches have been developed and successfully applied [7, 26, 41]. While these approaches have shown good results, there is a general lack of theoretical insight, and often it remains unclear what the neural networks used by these approaches are learning, or how we should enhance their learning power to address the problems on which they fail. Learning a single joint action-value function heavily relies on the function approximation abilities of the neural network, since it must generalize across a discrete action space whose size is exponential in the number of agents.

Definition 2 Cooperative one-shot game: a cooperative one-shot game is a game in which all agents share the same reward function $Q(a)$, so that the goal of the team is to maximize this shared reward by finding the optimal joint action $a^* \in A$ to perform.
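To illustrate Definition 2 and the exponential blow-up noted above, the toy snippet below (our own example, with arbitrary random payoffs) builds a shared reward table for a small cooperative game and finds $a^*$ by brute-force enumeration — feasible only for tiny teams, which is precisely why function approximation over the joint action space is needed.

```python
import itertools
import numpy as np

n_agents, n_actions = 3, 4
rng = np.random.default_rng(0)

# Shared reward Q(a) of a cooperative one-shot game: every agent
# receives the same payoff for the team's joint action a
# (toy random values here, standing in for a real payoff structure).
Q = {a: rng.standard_normal()
     for a in itertools.product(range(n_actions), repeat=n_agents)}

# The joint action space has |A| = n_actions ** n_agents entries,
# i.e. exponential in the number of agents (4**3 = 64 here).
assert len(Q) == n_actions ** n_agents

# Brute-force maximization of the shared reward: a* = argmax_a Q(a).
a_star = max(Q, key=Q.get)
print(f"optimal joint action: {a_star}, Q(a*) = {Q[a_star]:.3f}")
```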