This paper investigates optimal control problems for finite-horizon continuous-time Markov decision processes with delay-dependent control policies. We develop compactification methods for decision processes and show the existence of optimal policies. Subsequently, through the dynamic programming principle for delay-dependent control policies, we establish the differential-difference Hamilton-Jacobi-Bellman (HJB) equation in the discrete-space setting. Under certain conditions, we give a comparison principle and further prove that the value function is the unique viscosity solution of this HJB equation. Based on this, we show that among the class of delay-dependent control policies there is an optimal one that is Markovian.
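For orientation, the classical (non-delayed) finite-horizon HJB equation for a continuous-time Markov decision process on a countable state space takes the form below; the notation here (value function $v$, transition rates $q$, running reward $r$, terminal reward $g$, horizon $T$) is illustrative and not taken from the paper, whose differential-difference HJB equation additionally involves delayed arguments of the policy:
\[
\frac{\partial v}{\partial t}(t,i) + \sup_{a \in A(i)} \Big\{ r(i,a) + \sum_{j \in S} q(j \mid i,a)\, v(t,j) \Big\} = 0, \qquad v(T,i) = g(i), \quad i \in S.
\]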