Abstract

This article develops a model-free adaptive optimal control policy for discrete-time Markov jump systems. First, a two-player zero-sum game is formulated to obtain an optimal control policy that minimizes a cost function against the worst-case disturbance. Second, an action- and mode-dependent value function is set up for the zero-sum game to search for such a policy with a convergence guarantee, rather than solving an optimization problem constrained by coupled algebraic Riccati equations. Specifically, motivated by the Bellman optimality principle, we develop an online value iteration algorithm for the zero-sum game that learns while controlling and requires no initial stabilizing policy. With this algorithm, disturbance attenuation is achieved for Markov jump systems without knowledge of the system matrices. The model-free feature and policy convergence also provide adaptivity to slowly varying uncertainties. Finally, the effectiveness and practical potential of the algorithm are demonstrated on two numerical examples and a solar boiler system.
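
To make the value-iteration step concrete, a minimal sketch of the recursion is given below. It assumes a quadratic stage cost $x^\top Q x + u^\top R u - \gamma^2 w^\top w$, mode-dependent dynamics $x_{t+1} = A_i x_t + B_i u_t + D_i w_t$, and mode transition probabilities $\pi_{ij}$; these symbols are illustrative and need not match the article's notation, and in the model-free setting the expectation over the next mode is evaluated from measured data rather than from the matrices $(A_i, B_i, D_i)$:

\[
Q_i^{k}(x,u,w) \;=\; x^\top Q x + u^\top R u - \gamma^2 w^\top w + \sum_{j} \pi_{ij}\, V_j^{k}\!\big(A_i x + B_i u + D_i w\big),
\]
\[
V_i^{k+1}(x) \;=\; \min_{u}\,\max_{w}\; Q_i^{k}(x,u,w), \qquad V_i^{0} \equiv 0,
\]

so each iteration updates the mode-dependent value functions from the Bellman recursion alone, which is consistent with the claim that no initial stabilizing policy is required.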
