Abstract

The main goal of this paper is to study the adaptive infinite-horizon discounted continuous-time optimal control problem for piecewise deterministic Markov processes (PDMPs), with the control acting continuously on the jump intensity λ and on the transition measure Q of the process. It is assumed that the jump parameters (λ and Q), as well as the continuous and boundary costs (Cg and Ci, respectively), depend on an unknown parameter β⁎. It is shown that the principle of estimation and control holds: the strategy consisting of choosing, at each stage n, an action according to an optimal stationary policy in which the true but unknown parameter β⁎ is replaced by its estimated value βˆn is asymptotically discount optimal, provided that the sequence of estimators {βˆn} of β⁎ is strongly consistent, that is, βˆn converges to β⁎ almost surely. In the framework of PDMPs, the so-called discrepancy function depends on the derivative of the value function along the flow as well as on certain boundary conditions, which brings new challenges to the analysis of this problem.
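To make the criterion concrete, a typical form of the discounted cost for controlled PDMPs with a running cost Cg and a boundary cost Ci can be sketched as follows. This is only an illustration under standard assumptions: the discount rate α > 0, the control process u, and the boundary-jump counting process N^∂ are notation introduced here, not taken from the abstract.

```latex
\[
\mathcal{J}(U,x) \;=\;
\mathbb{E}^{U}_{x}\!\left[
\int_{0}^{\infty} e^{-\alpha s}\, C^{g}_{\beta^{*}}\bigl(X_{s}, u(s)\bigr)\, ds
\;+\;
\int_{0}^{\infty} e^{-\alpha s}\, C^{i}_{\beta^{*}}\bigl(X_{s-}, u(s)\bigr)\, dN^{\partial}_{s}
\right],
\]
```

Here N^∂ counts the jumps of the process from the boundary of the state space, and the subscript β⁎ records that both costs depend on the unknown parameter; the adaptive strategy replaces β⁎ in λ, Q, Cg, and Ci by the current estimate βˆn at each stage n.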

