Mixed Reinforcement Learning for Partially Observable Markov Decision Process

Le Tien Dung, Takashi Komeda, Motoki Takagi

doi:10.1109/cira.2007.382910

Abstract

Reinforcement learning has been widely used to solve problems in which the environment provides little feedback. Q-learning solves fully observable Markov Decision Processes quite well. For Partially Observable Markov Decision Processes (POMDPs), a Recurrent Neural Network (RNN) can be used to approximate Q values, but learning time for these problems is typically very long. This paper presents Mixed Reinforcement Learning, which finds an optimal policy for POMDPs in a shorter learning time. The method uses both a Q-value table and an RNN: the table stores Q values for fully observable states, and the RNN approximates Q values for hidden states. An observable degree is calculated for each state while the agent explores the environment. If the observable degree is below a threshold, the state is treated as hidden. Experimental results on the lighting grid world problem show that the proposed method enables an agent to acquire a policy as good as the one acquired using only an RNN, with better learning performance.
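
The routing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the observable degree is computed, so the measure used here (the share of the most frequent successor observation, minimised over tried actions) is purely an assumption, as are the class name `MixedQ`, the `record` helper, and the `rnn_q` callable that stands in for a trained RNN.

```python
from collections import defaultdict


class MixedQ:
    """Route Q-value lookups to a table or to a recurrent approximator."""

    def __init__(self, n_actions, threshold, rnn_q):
        self.n_actions = n_actions
        self.threshold = threshold  # hypothetical cut-off for "hidden"
        self.rnn_q = rnn_q          # stand-in for an RNN: history -> Q values
        self.table = defaultdict(lambda: [0.0] * n_actions)
        # counts[(obs, action)][next_obs] = times this transition was seen
        self.counts = defaultdict(lambda: defaultdict(int))

    def record(self, obs, action, next_obs):
        """Log one transition gathered during exploration."""
        self.counts[(obs, action)][next_obs] += 1

    def observable_degree(self, obs):
        """ASSUMED measure, not taken from the paper.

        For each tried action, take the share of the most frequent
        successor observation; return the minimum over those actions.
        Unvisited observations default to 1.0 (treated as observable).
        """
        degrees = []
        for action in range(self.n_actions):
            seen = self.counts.get((obs, action))
            if seen:
                degrees.append(max(seen.values()) / sum(seen.values()))
        return min(degrees) if degrees else 1.0

    def is_hidden(self, obs):
        return self.observable_degree(obs) < self.threshold

    def q_values(self, obs, history):
        """Use the RNN for hidden states and the table otherwise."""
        if self.is_hidden(obs):
            return self.rnn_q(history)
        return self.table[obs]
```

Under these assumptions, an observation whose successors are consistent keeps using the cheap table lookup, and only observations with ambiguous successors fall back to the history-based approximator.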

Similar Papers

Solving POMDPs with Automatic Discovery of Subgoals
01 Jan 2009

REINFORCEMENT LEARNING FOR POMDP USING STATE CLASSIFICATION
Le Tien Dung ... Motoki Takagi
Applied Artificial Intelligence | VOL. 22
26 Aug 2008

A Bayesian game based adaptive fuzzy controller for multiagent POMDPs
Rajneesh Sharma ... Matthijs T J Spaan
01 Jul 2010

Task-Aware Verifiable RNN-Based Policies for Partially Observable Markov Decision Processes
Steven Carr ... Ufuk Topcu
Journal of Artificial Intelligence Research | VOL. 72
18 Nov 2021
