Abstract

This paper studies complex partially observable Markov decision process (POMDP) systems with discrete state and action spaces from a large-scale systems point of view. By introducing hierarchical control methods, a complex high-dimensional POMDP system can be decomposed into several low-dimensional subsystems that are free of the related constraints. The optimization problem of each subsystem can then be solved independently with a simulation-based policy iteration algorithm built on sensitivity analysis, which significantly reduces the computational overhead of optimizing the entire system. The algorithm does not require overly strict assumptions and can be applied to most practical problems. A numerical example is provided to illustrate its applicability.
