Abstract
Two data-driven strategies for value iteration in infinite-horizon linear quadratic optimal control problems are proposed. The two architectures share common features: both are purely continuous-time control architectures, and both are based on the forward integration of the Differential Riccati Equation (DRE). They differ profoundly, however, in how the vector field of the underlying DRE is estimated from collected data: the first relies on a characterization of properties of the advantage function associated with the problem, whereas the second is inspired by tools from adaptive control theory and ensures semi-global exponential convergence to the optimal solution. Advantages and drawbacks of the two architectures are discussed, and their performance is validated on a benchmark numerical example.
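To illustrate what forward integration of the DRE means, the following is a minimal model-based sketch, not the data-driven schemes proposed here: it assumes the system matrices are known and integrates dP/dt = AᵀP + PA − PBR⁻¹BᵀP + Q forward from P(0) = 0 with an explicit Euler step. The double-integrator example, the weights Q = I and R = 1, the step size, the horizon, and all function names are assumptions chosen for illustration. In the proposed architectures the right-hand side would instead be estimated from collected data.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Transpose of a matrix stored as nested lists."""
    return [list(row) for row in zip(*X)]


def dre_rhs(P, A, B, Q, r):
    """Vector field A^T P + P A - P B R^{-1} B^T P + Q (single input, scalar R = r).

    Assumes P is symmetric, so that B^T P = (P B)^T.
    """
    n = len(P)
    AtP = matmul(transpose(A), P)
    PA = matmul(P, A)
    PB = matmul(P, B)                  # n x 1 column
    quad = matmul(PB, transpose(PB))   # P B B^T P
    return [[AtP[i][j] + PA[i][j] - quad[i][j] / r + Q[i][j]
             for j in range(n)] for i in range(n)]


def integrate_dre(A, B, Q, r, t_final=20.0, dt=1e-3):
    """Explicit Euler integration of the DRE forward in time from P(0) = 0."""
    n = len(A)
    P = [[0.0] * n for _ in range(n)]
    for _ in range(int(round(t_final / dt))):
        dP = dre_rhs(P, A, B, Q, r)
        P = [[P[i][j] + dt * dP[i][j] for j in range(n)] for i in range(n)]
    return P


# Hypothetical example: double integrator with Q = I and R = 1.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]

P_inf = integrate_dre(A, B, Q, r=1.0)
K = matmul(transpose(B), P_inf)  # feedback gain R^{-1} B^T P with R = 1
```

For this particular example the algebraic Riccati equation can be solved by hand, giving P = [[√3, 1], [1, √3]], so `P_inf` can be checked against that value. Explicit Euler is used only to keep the sketch short; a higher-order integrator would be a natural substitute.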