Abstract

The POMDP is a fundamental model for decision making under uncertainty. As a generalization of the exact POMDP model, the bounded-parameter POMDP (BPOMDP) specifies only upper and lower bounds on the state-transition probabilities, observation probabilities, and rewards, which makes it particularly suitable for settings where the underlying model is imprecisely known or time-varying. This paper presents an optimistic optimality criterion for solving BPOMDPs, under which the optimistically optimal value function is defined. By representing a policy explicitly as a finite-state controller, we propose a policy iteration approach that is shown to converge to an $\epsilon$-optimal policy under the optimistic optimality criterion.
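To illustrate the flavor of the optimistic criterion, the following is a minimal sketch (not the paper's algorithm) of the standard optimistic resolution of interval transition probabilities used in bounded-parameter models: among all distributions consistent with the given lower and upper bounds, pick the one that maximizes expected value. The function name `optimistic_dist` and the greedy construction shown are illustrative assumptions.

```python
def optimistic_dist(lo, hi, values):
    """Choose a transition distribution p with lo[i] <= p[i] <= hi[i]
    and sum(p) == 1 that maximizes sum(p[i] * values[i]).

    Greedy construction: start every state at its lower bound, then
    pour the remaining probability mass into the highest-value states
    first, up to each state's upper bound.
    """
    n = len(values)
    p = list(lo)
    slack = 1.0 - sum(lo)  # mass still to be distributed
    # Visit successor states from highest to lowest value.
    for i in sorted(range(n), key=lambda i: -values[i]):
        add = min(hi[i] - lo[i], slack)
        p[i] += add
        slack -= add
    return p

# Hypothetical three-state example: bounds on transition probabilities
# and current value estimates for the successor states.
lo = [0.1, 0.2, 0.1]
hi = [0.6, 0.7, 0.5]
values = [5.0, 1.0, 3.0]
p = optimistic_dist(lo, hi, values)
```

The greedy step is optimal because the objective is linear in `p`: any mass moved from a lower-value state to a higher-value state (within the bounds) cannot decrease the expectation.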
