Abstract

Given a parametrized stabilizing controller, the approach presented in this work seeks parameters that are optimal with respect to an infinite-horizon cost. Since this cost is in general not computable, an adaptive actor-critic structure is used to approximate the corresponding value function. The actor is realized explicitly using the projected subgradient method. A particular challenge is that the approximated value function is time-varying: it depends on both the evolution of the dynamical system and the critic's approximation of the value function. Provided that a certain stability constraint is convex, and under persistence of excitation conditions, it is shown that the actor and critic parameters converge to prescribed vicinities of their optimal values. The entire setup is formulated in continuous time. A computational study is presented.
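
For readers unfamiliar with the projected subgradient method mentioned above, the following is a minimal sketch of its generic discrete-time form, not the continuous-time actor of this work. Everything in it is an assumption made for illustration: the cost is a fixed L1 distance to a hypothetical target (standing in for the time-varying approximated value function), the convex constraint set is a simple box (standing in for the convex stability constraint), and the step size 1/(k+1) is one common diminishing choice.

```python
def project_onto_box(theta, lower, upper):
    """Euclidean projection onto a box, a simple convex constraint set (assumed)."""
    return [min(max(t, lo), hi) for t, lo, hi in zip(theta, lower, upper)]


def l1_cost(theta, target):
    """Illustrative nonsmooth convex cost: L1 distance to a hypothetical target."""
    return sum(abs(t - c) for t, c in zip(theta, target))


def l1_subgradient(theta, target):
    """One valid subgradient of the L1 cost: the sign of each residual (0 at a kink)."""
    return [(t > c) - (t < c) for t, c in zip(theta, target)]


def projected_subgradient(theta0, target, lower, upper, steps=200):
    """Iterate theta <- P(theta - a_k * g_k) with diminishing steps a_k = 1/(k+1).

    Subgradient steps need not decrease the cost monotonically,
    so the best iterate seen so far is tracked and returned.
    """
    theta = project_onto_box(theta0, lower, upper)
    best_theta, best_cost = theta, l1_cost(theta, target)
    for k in range(steps):
        g = l1_subgradient(theta, target)
        step = 1.0 / (k + 1)
        moved = [t - step * gi for t, gi in zip(theta, g)]
        theta = project_onto_box(moved, lower, upper)
        cost = l1_cost(theta, target)
        if cost < best_cost:
            best_theta, best_cost = theta, cost
    return best_theta, best_cost


if __name__ == "__main__":
    # The first target coordinate (3.0) lies outside the box, so the
    # constrained minimizer sits on the boundary of the feasible set.
    theta, cost = projected_subgradient(
        theta0=[0.0, 0.0],
        target=[3.0, -0.5],
        lower=[-1.0, -1.0],
        upper=[2.0, 1.0],
    )
    print(theta, cost)
```

The projection step is what keeps every iterate inside the feasible set, which is why convexity of the constraint matters: projection onto a convex set is well defined and nonexpansive.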

