Abstract
The optimal control of discrete-time nonlinear systems depends on solving the Bellman equation. In this paper, an adaptive reinforcement learning (RL) method that balances value iteration (VI) and policy iteration (PI) is developed to solve this equation. By introducing a balance parameter, the adaptive RL integrates VI and PI, which accelerates the convergence of VI and avoids the need for an initial admissible control. The convergence of the adaptive RL is proved by showing that its iterates converge to the solution of the Bellman equation. The adaptive RL is then realized by approximating the value function with a neural network (NN), and a least-squares scheme is developed for updating the NN weights. The convergence of the NN-based adaptive RL is proved while accounting for the NN approximation error. To further improve performance, an adaptive rule is developed that tunes the balance parameter from iteration to iteration. Finally, the effectiveness of the adaptive RL is validated through simulation studies.
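To make the blending idea concrete, below is a minimal, illustrative sketch of how a balance parameter could interpolate between a VI-style one-step backup and a PI-style full policy evaluation on a small finite MDP. It is not the paper's exact update rule or the NN-based realization; the problem data (P, U, gamma), the parameter name lam, and the convex-combination blending rule are all assumptions introduced only for illustration.

```python
# Illustrative sketch (not the paper's exact scheme): a balance parameter `lam`
# interpolating between a value-iteration (VI) backup and a policy-iteration
# (PI) style policy evaluation on a small discrete-time, finite MDP.
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
U = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # stage cost

def vi_backup(V):
    """VI-style one-step Bellman backup: minimize cost-to-go over actions."""
    Q = U + gamma * P @ V                  # Q[s, a]
    return Q.min(axis=1), Q.argmin(axis=1)

def policy_eval(policy):
    """PI-style evaluation: solve V = U_pi + gamma * P_pi V exactly."""
    P_pi = P[np.arange(n_states), policy]
    U_pi = U[np.arange(n_states), policy]
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, U_pi)

def adaptive_rl(lam=0.5, n_iter=50):
    """Blend VI and PI updates: lam = 0 recovers pure VI, lam = 1 pure PI."""
    V = np.zeros(n_states)                 # starts from zero; no admissible policy needed
    for _ in range(n_iter):
        V_vi, greedy = vi_backup(V)        # VI-style one-step update and greedy policy
        V_pi = policy_eval(greedy)         # PI-style full evaluation of that policy
        V = (1.0 - lam) * V_vi + lam * V_pi  # balance-parameter mix (assumed form)
    return V

print(adaptive_rl(lam=0.5))
```

In this sketch, lam plays the role of the balance parameter: small values behave like VI (cheap iterations, slower convergence), while values near one behave like PI (expensive evaluations, faster convergence), which is the trade-off the abstract describes; an adaptive rule would adjust lam across iterations rather than fixing it.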