Abstract

Partially observable Markov decision processes (POMDPs) have been widely adopted in the automated planning literature because they elegantly capture both execution and observation uncertainty. In our previous paper, we proposed the vector autoregressive partially observable Markov decision process (VAR-POMDP), which extends the traditional POMDP by modeling the temporal correlation among continuous observations. However, developing a tractable planning algorithm with performance guarantees for the VAR-POMDP model is non-trivial, as most existing algorithms must explicitly enumerate all possible observation histories, which lie in an unbounded continuous space. In this letter, we extend the well-known point-based value iteration algorithm to a double point-based value iteration and show that the VAR-POMDP model can be solved by dynamic programming, approximating the exact value function by a class of piecewise-linear functions. We further prove that the approximation error is bounded. The effectiveness of the proposed planning algorithm is illustrated by an example.
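For readers unfamiliar with the point-based family of algorithms the abstract builds on, the sketch below illustrates a single point-based value backup for an ordinary *discrete* POMDP, where the value function is represented by a set of alpha-vectors (a piecewise-linear, convex function of the belief). This is only a minimal illustration of standard point-based value iteration, not the authors' double point-based algorithm for VAR-POMDPs; the tiny randomly generated model (`T`, `O`, `R`) is invented for demonstration.

```python
import numpy as np

# Toy discrete POMDP, invented for illustration only.
n_s, n_a, n_o = 2, 3, 2
rng = np.random.default_rng(0)

T = rng.dirichlet(np.ones(n_s), size=(n_a, n_s))   # T[a, s, s'] = P(s' | s, a)
O = rng.dirichlet(np.ones(n_o), size=(n_a, n_s))   # O[a, s', o] = P(o | s', a)
R = rng.uniform(-1.0, 1.0, size=(n_a, n_s))        # R[a, s]  = immediate reward
gamma = 0.95

def backup(beliefs, alphas):
    """One point-based backup: one new alpha-vector per sampled belief point.

    The value function V(b) = max_i alpha_i . b stays piecewise linear and
    convex after each backup, which is what makes the approximation tractable.
    """
    new_alphas = []
    for b in beliefs:
        best_val, best_vec = -np.inf, None
        for a in range(n_a):
            vec = R[a].copy()
            for o in range(n_o):
                # Candidate vectors g_i(s) = sum_{s'} T[a,s,s'] O[a,s',o] alpha_i(s');
                # keep the one maximizing the value at this belief point.
                g = np.array([T[a] @ (O[a, :, o] * alpha) for alpha in alphas])
                vec = vec + gamma * g[np.argmax(g @ b)]
            if vec @ b > best_val:
                best_val, best_vec = vec @ b, vec
        new_alphas.append(best_vec)
    return new_alphas

# Iterate backups over a fixed, finite set of belief points.
beliefs = [np.array([0.5, 0.5]), np.array([0.9, 0.1]), np.array([0.1, 0.9])]
alphas = [np.zeros(n_s)]
for _ in range(50):
    alphas = backup(beliefs, alphas)

values = [max(alpha @ b for alpha in alphas) for b in beliefs]
```

The key point, mirrored in the abstract, is that the backup never enumerates observation histories: it only updates the value function at a finite set of belief points, keeping one alpha-vector per point. The VAR-POMDP setting complicates this because observations are continuous and temporally correlated, which is what the letter's double point-based scheme addresses.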
