Deep Reinforcement Learning Algorithms for Multi-Agent Systems - A Solution for Modeling Epidemics

S Sethu Selvi, M P Venkatesh Bhardwaj, Dhananjaya Kumar M V, Kartik S Shetty, Lohith N V

doi:10.1109/mysurucon52639.2021.9641663

Abstract

Multi-agent reinforcement learning (MARL) involves a large number of artificial intelligence-based agents interacting with each other in the same environment, often collaborating towards a common goal. In a single-agent reinforcement learning system, the environment changes only as a result of that one agent's actions. In contrast, a multi-agent environment is subject to the actions of all the agents involved. Multi-agent systems can be deployed in applications such as stock trading to maximize profits, control and coordination of robot swarms, modeling of epidemics, autonomous vehicle and traffic control, smart grids, and self-healing networks. Such complex tasks cannot be solved with a single pre-programmed agent. Instead, the agents should be trained to search for a solution automatically through reinforcement learning (RL) based algorithms. In general, decision making in a multi-agent system becomes nearly intractable because the problem size grows exponentially with the number of agents. In this paper, multi-agent systems using Deep Reinforcement Learning (DRL) are explored, with a possible application in modeling of epidemics. Different stochastic environments are considered, and various multi-agent policies are implemented using DRL. The performance of various MARL algorithms was evaluated against single-agent RL algorithms in these environments. MARL agents learned much faster than single RL agents and had a more stable training phase. Mean Field Q-Learning scaled well and performed much better even with hundreds of agents in the environment, making it a strong candidate for modeling and predicting epidemics such as the ongoing coronavirus (COVID-19) pandemic.

Full Text