Abstract

The goal of this article is to present a new and simple convergence analysis of dynamic programming for the linear–quadratic regulator problem of discrete-time linear time-invariant systems. In particular, error bounds are given in terms of both matrix inequalities and matrix norms. Under a mild assumption on the initial parameter, we prove that $Q$-value iteration converges exponentially to the optimal solution. Moreover, global asymptotic convergence is also shown. These results are then extended to policy iteration. We prove that, in contrast to $Q$-value iteration, policy iteration always converges exponentially fast. An example is given to illustrate the results.
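As a rough illustration of the kind of iteration the abstract refers to, the following is a minimal sketch of $Q$-value iteration for a discrete-time LQR problem. It assumes the standard textbook setup (dynamics x_{k+1} = A x_k + B u_k, stage cost x^T Qc x + u^T R u, and a quadratic Q-function [x; u]^T H [x; u]). The function names, the zero initial parameter, and the double-integrator example are all hypothetical choices for illustration and are not taken from the article.

```python
import numpy as np


def q_matrix(A, B, Qc, R, P):
    """Build H such that Q(x, u) = [x; u]^T H [x; u] for value matrix P."""
    return np.block([
        [Qc + A.T @ P @ A, A.T @ P @ B],
        [B.T @ P @ A, R + B.T @ P @ B],
    ])


def greedy_value(H, n):
    """Minimize the quadratic Q-function over u.

    Returns the resulting value matrix P and the gain K of u = -K x.
    """
    H_xx, H_xu = H[:n, :n], H[:n, n:]
    H_ux, H_uu = H[n:, :n], H[n:, n:]
    K = np.linalg.solve(H_uu, H_ux)
    P = H_xx - H_xu @ K  # Schur complement of H_uu in H
    return P, K


def q_value_iteration(A, B, Qc, R, tol=1e-12, max_iter=10_000):
    """Iterate H -> P -> H until successive value matrices stop changing."""
    n = A.shape[0]
    P = np.zeros((n, n))  # assumed initial parameter (not the article's condition)
    errors = []
    for _ in range(max_iter):
        H = q_matrix(A, B, Qc, R, P)
        P_next, K = greedy_value(H, n)
        errors.append(np.linalg.norm(P_next - P, 2))
        P = P_next
        if errors[-1] < tol:
            break
    return P, K, errors


# Hypothetical example: a sampled double integrator with unit weights.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qc = np.eye(2)
R = np.array([[1.0]])

P, K, errors = q_value_iteration(A, B, Qc, R)
print("iterations:", len(errors))
print("spectral radius of A - B K:", max(abs(np.linalg.eigvals(A - B @ K))))
```

Plotting `errors` on a logarithmic axis gives a quick visual check of whether the decay looks geometric for a given system. This only illustrates the behavior numerically and says nothing about the bounds proved in the article.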
