Abstract

This paper integrates recent work on Path Integral (PI) and Kullback-Leibler (KL) divergence stochastic optimal control theory with earlier work on risk sensitivity and the fundamental dualities between free energy and relative entropy. We derive the path integral optimal control framework and its iterative version from the aforementioned dualities. The resulting formulation of iterative path integral control is valid for general feedback policies and, in contrast to previous work, does not rely on pre-specified policy parameterizations. The derivation is based on successive applications of Girsanov's theorem and on the Radon-Nikodým derivative as applied to diffusion processes, which arises from the change of measure in the stochastic dynamics. We compare PI control derived from Dynamic Programming with PI control derived from the duality between free energy and relative entropy. Moreover, we extend our analysis of the relationship between free energy and relative entropy to the optimal control of Markov jump diffusion processes. Finally, we present the links between KL stochastic optimal control and the aforementioned dualities and discuss its generalizability.
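
For orientation, the duality between free energy and relative entropy referred to above can be sketched in its standard Legendre form. The notation here is an assumption and is not taken from the paper: $J$ denotes a path cost, $\mathbb{P}$ the measure induced by the uncontrolled dynamics, $\mathbb{Q}$ any measure absolutely continuous with respect to $\mathbb{P}$, and $\lambda > 0$ a temperature-like parameter. The paper's own sign and scaling conventions may differ.

```latex
% Free energy on the left, expected cost plus relative entropy on the right.
% J, P, Q and lambda are illustrative symbols, not the paper's notation.
\[
  -\lambda \log \mathbb{E}_{\mathbb{P}}\!\left[ e^{-J/\lambda} \right]
  \;=\;
  \inf_{\mathbb{Q} \ll \mathbb{P}}
  \left\{ \mathbb{E}_{\mathbb{Q}}[J]
        + \lambda\, \mathrm{KL}(\mathbb{Q} \,\|\, \mathbb{P}) \right\},
  \qquad
  \mathrm{KL}(\mathbb{Q} \,\|\, \mathbb{P})
  = \mathbb{E}_{\mathbb{Q}}\!\left[ \log \frac{d\mathbb{Q}}{d\mathbb{P}} \right].
\]
% The infimum is attained by the exponentially tilted measure:
\[
  \frac{d\mathbb{Q}^{*}}{d\mathbb{P}}
  = \frac{e^{-J/\lambda}}{\mathbb{E}_{\mathbb{P}}\!\left[ e^{-J/\lambda} \right]}.
\]
```

In this form, the Radon-Nikodým derivative $d\mathbb{Q}/d\mathbb{P}$ is the object that Girsanov's theorem makes explicit for diffusion processes when the drift is changed by a control.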
