Abstract

The aim of multi-agent reinforcement learning (MARL) systems is to enable interacting agents to collaboratively learn and adapt to the behavior of other agents. Typically, an agent receives private observations that provide only a partial view of the true state of the environment. In realistic settings, however, a harsh environment may cause one or more agents to exhibit arbitrarily faulty or malicious behavior, which can be enough to make current coordination mechanisms fail. In this paper, we study a practical scenario for multi-agent reinforcement learning systems, considering the security issues that arise in the presence of agents with arbitrarily faulty or malicious behavior. The previous state-of-the-art work that coped with extremely noisy environments was designed on the assumption that the noise intensity in the environment is known in advance. When the noise intensity changes, however, the existing method must adjust the model configuration to learn in the new environment, which limits its practical applicability. To overcome these difficulties, we present an Attention-based Fault-Tolerant (FT-Attn) model that selects not only correct but also relevant information for each agent at every time step in noisy environments. The multi-head attention mechanism enables the agents to learn effective communication policies through experience, concurrently with their action policies. Empirical results show that FT-Attn beats previous state-of-the-art methods in several extremely noisy environments, in both cooperative and competitive scenarios, coming much closer to the upper-bound performance. Furthermore, FT-Attn offers a more general fault-tolerance ability and does not rely on prior knowledge of the noise intensity of the environment.

Highlights

  • Consider the following robotic search and rescue scenario: A group of Unmanned Aerial Vehicles (UAVs) is sent to find the survivors in a group of high-rise buildings after an earthquake [1]

  • FT-Attn can be directly utilized to learn in various kinds of noisy environments with no need to tune the configuration of the model, while the Multi-agent Deep Deterministic Policy Gradient (MADDPG)-M has to adjust the amount of information to be shared among agents

  • We believe that adding our idea of fault tolerance will make the existing models much more valuable and practical

Summary

Introduction

Consider the following robotic search and rescue scenario: a group of Unmanned Aerial Vehicles (UAVs) is sent to find the survivors in a group of high-rise buildings after an earthquake [1]. A proper information-filtering mechanism needs to be designed for the agents to extract correct and relevant information from others and model the environment without being restricted by the noise intensity. The model should be able to handle complex cases in which an agent needs to access multiple agents' correct observations at the same time, without prior knowledge of the noise intensity of the environment. FT-Attn can be used directly to learn in various kinds of noisy environments with no need to tune the model configuration, whereas MADDPG-M has to adjust the amount of information to be shared among agents.
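To make the filtering idea concrete, the following is a minimal NumPy sketch of the kind of per-agent multi-head attention aggregation described above: each agent scores every other agent's encoded observation and forms a weighted combination, so unreliable sources can receive low weight. The random projection matrices stand in for learned parameters, and the function name and shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def multi_head_attention(obs_enc, num_heads=2, seed=0):
    """Aggregate agents' encoded observations with multi-head attention.

    obs_enc: (n_agents, d) array of per-agent observation encodings.
    Returns (n_agents, d) aggregated features and the per-head
    attention weights of shape (num_heads, n_agents, n_agents).
    """
    n, d = obs_enc.shape
    assert d % num_heads == 0, "feature dim must split evenly across heads"
    dh = d // num_heads
    rng = np.random.default_rng(seed)
    # Random projections stand in for learned query/key/value parameters.
    Wq = rng.standard_normal((num_heads, d, dh))
    Wk = rng.standard_normal((num_heads, d, dh))
    Wv = rng.standard_normal((num_heads, d, dh))
    outs, weights = [], []
    for h in range(num_heads):
        Q = obs_enc @ Wq[h]                 # (n, dh) queries, one per agent
        K = obs_enc @ Wk[h]                 # (n, dh) keys
        V = obs_enc @ Wv[h]                 # (n, dh) values
        scores = Q @ K.T / np.sqrt(dh)      # (n, n): agent i scoring agent j
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=1, keepdims=True)      # softmax over sources
        outs.append(attn @ V)               # weighted mix of agents' values
        weights.append(attn)
    return np.concatenate(outs, axis=1), np.stack(weights)
```

In a trained model the attention weights would be shaped by experience so that faulty or irrelevant agents are down-weighted; here the weights merely illustrate the data flow, with each row summing to 1 over the candidate source agents.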

Related Work
MARL Models Designed for Normal Environments
MARL Models Designed for Noisy Environments
Partially Observable Markov Games
Policy Gradient and Actor–Critic
Our Approach
Problem Formulation
Attention-Based Fault-Tolerant Mechanism
Training Details of FT-Attn
Experimental Setting and Baseline Methods
Performance Comparison in a Modified Cooperative Navigation Scenario
Method
Performance Comparison in the Modified Predator and Prey Scenario
Attention Visualization
Findings
Discussion
Conclusions