Near Optimality of Finite Memory Policies for POMDPs with Continuous Spaces

Ali Devran Kara, Erhan Bayraktar, Serdar Yuksel

doi:10.1109/cdc51059.2022.9993165

Abstract

We study an approximation method for partially observed Markov decision processes (POMDPs) with continuous spaces. Reduction to a belief MDP has been the standard approach to studying POMDPs; however, because the belief MDP has an uncountable state space and requires strict regularity properties, it calls for rigorous approximation methods in practical applications. In this work, we focus on an approximation procedure that discretizes the observation space and constructs a fully observed finite MDP model from a finite-length history of the discretized observations and control actions. We show that the resulting policy is nearly optimal under some regularity assumptions on the observation channel and under certain controlled filter stability requirements for the hidden state process. We also provide a Q-learning algorithm that uses a finite memory of discretized information variables, and prove its convergence to the optimality equation of the finite fully observed MDP constructed by the approximation method.
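
To make the finite-memory idea concrete, the following is a minimal Python sketch, not the authors' implementation. It quantizes a continuous observation into a finite number of cells, treats the last few (action, quantized observation) pairs as the state of a finite MDP, and runs tabular Q-learning on that state. The toy dynamics in `toy_step`, the uniform quantizer, the uniformly random exploration policy, the visit-count learning rate, and all parameter values are assumptions chosen only for illustration.

```python
import random
from collections import defaultdict, deque


def quantize(y, low=0.0, high=1.0, n_bins=4):
    """Map a continuous observation y to one of n_bins uniform cells on [low, high]."""
    y = min(max(y, low), high)
    return min(int((y - low) / (high - low) * n_bins), n_bins - 1)


def toy_step(x, u, rng):
    """Hypothetical toy model: hidden state in [0, 1]; action 1 nudges it up, 0 down."""
    drift = 0.1 if u == 1 else -0.1
    x_next = min(max(x + drift + rng.gauss(0, 0.05), 0.0), 1.0)
    y_next = min(max(x_next + rng.gauss(0, 0.1), 0.0), 1.0)  # noisy observation
    cost = (x_next - 0.5) ** 2  # stage cost: penalize distance from the midpoint
    return x_next, y_next, cost


def finite_memory_q_learning(n_steps=20000, window=2, n_bins=4, beta=0.9, seed=0):
    """Tabular Q-learning whose 'state' is a finite window of quantized history."""
    rng = random.Random(seed)
    actions = (0, 1)
    q = defaultdict(float)     # Q-values indexed by (memory, action)
    visits = defaultdict(int)  # visit counts, used for the learning rate

    x = rng.random()
    # Each memory entry is (previous action, quantized observation).
    # The first entry has no previous action and, for brevity, a noise-free observation.
    memory = deque(maxlen=window)
    memory.append((None, quantize(x, n_bins=n_bins)))

    for _ in range(n_steps):
        s = tuple(memory)
        u = rng.choice(actions)  # assumed exploration policy: uniform random
        x, y, cost = toy_step(x, u, rng)
        memory.append((u, quantize(y, n_bins=n_bins)))
        s_next = tuple(memory)

        visits[(s, u)] += 1
        alpha = 1.0 / visits[(s, u)]  # assumed learning rate
        # Costs are minimized, so the target uses a min over next actions.
        target = cost + beta * min(q[(s_next, v)] for v in actions)
        q[(s, u)] += alpha * (target - q[(s, u)])
    return q


def greedy_action(q, memory_state, actions=(0, 1)):
    """Pick the action with the smallest learned Q-value for a given memory window."""
    return min(actions, key=lambda v: q[(memory_state, v)])
```

In this sketch the table size is bounded by the number of quantization cells, actions, and the window length, which is what makes the learned problem a finite, fully observed MDP. A longer window or finer quantizer enlarges the table in exchange for a richer summary of the history.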

