Abstract

This paper proposes a Bayesian-game-based fuzzy reinforcement learning (RL) controller for decentralized partially observable Markov decision processes (Dec-POMDPs). Dec-POMDPs have recently emerged as a powerful framework for optimizing multiagent sequential decision making in partially observable stochastic environments. However, finding exact optimal solutions to a Dec-POMDP is provably intractable (NEXP-complete), necessitating approximate or suboptimal solution approaches. The proposed approach computes an approximate solution by employing fuzzy inference systems (FISs) in a game-based RL setting. It exploits the universal approximation capability of fuzzy systems to compactly represent a Dec-POMDP as a fuzzy Dec-POMDP, allowing the controller to progressively learn and refine an approximate solution to the underlying Dec-POMDP. The controller combines FIS-based RL with a representation of the Dec-POMDP as a sequence of Bayesian games (BGs). We implement the proposed controller for two scenarios: 1) Dec-POMDPs with free communication between agents; and 2) Dec-POMDPs without communication. We empirically evaluate the proposed approach on three standard benchmark problems: 1) multiagent tiger; 2) multiaccess broadcast channel; and 3) recycling robot. Simulation results and comparative evaluation against other Dec-POMDP solution approaches demonstrate the effectiveness and feasibility of employing FIS-based game-theoretic RL for designing Dec-POMDP controllers.
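To make the idea of a fuzzy value function driving action choice in per-step games more concrete, the sketch below shows a generic fuzzy Q-learning update over a one-dimensional belief feature. It is a minimal illustration under assumed details, not the paper's actual algorithm: the FuzzyQ class, triangular membership functions, single belief feature, learning rate, and greedy joint-action selection (standing in for solving the stage Bayesian game) are all illustrative assumptions.

```python
# Illustrative sketch (not the paper's algorithm): a fuzzy Q-function over a
# 1-D belief feature, used to score joint actions at one Dec-POMDP decision step.
import numpy as np

class FuzzyQ:
    """Zero-order Takagi-Sugeno fuzzy Q-function: each fuzzy rule holds one
    Q-value per joint action; outputs are firing-strength-weighted sums."""

    def __init__(self, centers, n_joint_actions, lr=0.1, gamma=0.95):
        self.centers = np.asarray(centers, dtype=float)      # rule centers over belief in [0, 1]
        self.q = np.zeros((len(centers), n_joint_actions))   # rule consequents
        self.lr, self.gamma = lr, gamma

    def memberships(self, b):
        # Triangular membership functions centered on self.centers.
        width = self.centers[1] - self.centers[0]
        mu = np.maximum(0.0, 1.0 - np.abs(b - self.centers) / width)
        return mu / mu.sum()

    def q_values(self, b):
        # Defuzzified Q-value for every joint action at belief b.
        return self.memberships(b) @ self.q

    def update(self, b, joint_a, reward, b_next):
        # Fuzzy Q-learning style update: spread the TD error over the rules
        # in proportion to their firing strengths.
        mu = self.memberships(b)
        target = reward + self.gamma * self.q_values(b_next).max()
        td = target - self.q_values(b)[joint_a]
        self.q[:, joint_a] += self.lr * td * mu

# Usage: choose the greedy joint action for the current belief, then learn from
# the observed transition (reward and next belief are made-up numbers here).
fq = FuzzyQ(centers=np.linspace(0, 1, 5), n_joint_actions=9)
b = 0.5                                     # e.g. belief that the tiger is behind the left door
a = int(np.argmax(fq.q_values(b)))          # joint action picked for this step's stage game
fq.update(b, a, reward=-2.0, b_next=0.85)
```

In the communication-free setting described in the abstract, each agent would maintain such estimates over its own (possibly divergent) belief rather than a shared one; the shared-belief version above corresponds more closely to the free-communication scenario.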
