Abstract

To overcome the numerical stability problems inherent in recursive least-squares (RLS)-based adaptive dynamic programming paradigms for online optimal control design, a novel method is proposed to improve the state-value function approximations in online algorithms for discrete linear quadratic regulator (DLQR) control system design. The resulting algorithms are embedded in actor-critic architectures based on heuristic dynamic programming (HDP). The proposed solution is grounded on unitary transformations and QR decomposition, integrated into the critic network, to improve the RLS learning efficiency for online realization of the HDP-DLQR control design. The learning strategy is designed to improve numerical stability and reduce computational cost, with the aim of making real-time implementation of optimal control design based on actor-critic reinforcement learning paradigms feasible. The convergence behavior and numerical stability of the proposed online algorithm are evaluated through computational simulations on two multiple-input multiple-output models: a fourth-order RLC circuit with two input voltages and two controllable voltage levels, and a doubly fed induction generator with six inputs and six outputs for wind energy conversion systems.
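As a rough illustration of the QR-decomposition idea the abstract refers to (a sketch of a generic QR-based RLS update, not the paper's exact HDP critic algorithm), the weighted normal equations can be carried forward as an upper-triangular factor `R` and a transformed target vector `z`, with each new sample absorbed by an orthogonal (unitary) re-triangularization instead of propagating an inverse covariance matrix, which is the usual source of RLS numerical instability:

```python
import numpy as np

def qr_rls_update(R, z, x, y, lam=0.99):
    """One QR-based recursive least-squares step (illustrative sketch).
    R: upper-triangular factor and z: transformed target vector summarizing
    past data, so that solving R @ theta = z gives the current estimate.
    x: new regressor row, y: its target, lam: forgetting factor."""
    # Stack the forgotten factor with the new row, then re-triangularize
    # with an orthogonal transformation. Full QR is used here for clarity;
    # Givens rotations would give the O(n^2) per-step variant.
    A = np.vstack([np.sqrt(lam) * R, x.reshape(1, -1)])
    b = np.concatenate([np.sqrt(lam) * z, [y]])
    Q, R_new = np.linalg.qr(A)   # reduced QR: A = Q @ R_new
    z_new = Q.T @ b
    return R_new, z_new

def solve_weights(R, z):
    """Back-substitution for the parameter estimate: R @ theta = z."""
    return np.linalg.solve(R, z)
```

Because only the well-conditioned triangular factor is propagated, round-off errors do not accumulate the way they can in covariance-form RLS, which is the kind of benefit the proposed critic-network integration targets.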
