Abstract
Because, in the real world, a certain agent may have an advantage that allows it to act before others, a novel hierarchical optimal synchronization problem for linear systems, composed of one major agent and multiple minor agents, is formulated and studied in this article from a Stackelberg-Nash game perspective. The major agent makes its decision before the others, and then all the minor agents determine their actions simultaneously. To seek the optimal controllers, coupled Hamilton-Jacobi-Bellman (HJB) equations are established, whose solutions are further proven to be stabilizing and to constitute the Stackelberg-Nash equilibrium. Due to the asymmetric roles of the agents, the established HJB equations are more strongly coupled and more difficult to solve than those in most existing works. Therefore, we propose a new reinforcement learning (RL) algorithm, namely a two-level value iteration (VI) algorithm, which does not rely on complete knowledge of the system matrices. Furthermore, the proposed algorithm is shown to be convergent, and the converged values are exactly the optimal ones. To implement this VI algorithm, neural networks (NNs) are employed to approximate the value functions, and the gradient descent method is used to update the NN weights. Finally, an illustrative example is provided to verify the effectiveness of the proposed algorithm.
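The abstract's implementation pattern, value iteration whose Bellman targets are fitted by gradient-descent updates of value-function weights, can be illustrated in a deliberately simplified setting. The sketch below is not the paper's two-level algorithm: it uses a single scalar linear system with a quadratic value function V(x) = w·x², and all constants (a, b, q, r, the state grid, step sizes) are illustrative assumptions.

```python
# Illustrative sketch only (NOT the paper's two-level VI algorithm):
# value iteration for a scalar linear system x' = a*x + b*u with stage
# cost q*x^2 + r*u^2, where the value-function weight w in V(x) = w*x^2
# is updated by gradient descent on the Bellman residual. The true
# solution is the fixed point of the discrete-time Riccati recursion.

a, b = 0.9, 1.0      # assumed system dynamics
q, r = 1.0, 1.0      # assumed stage-cost weights

xs = [-2.0 + 0.1 * i for i in range(41)]  # sampled training states

w = 0.0  # value-function weight, V(x) = w * x^2
for _ in range(30):  # outer value-iteration sweeps
    def target(x, w_old=w):
        # Greedy control for the current value estimate (closed form):
        # u* = argmin_u [q x^2 + r u^2 + w_old (a x + b u)^2]
        u = -w_old * a * b * x / (r + w_old * b * b)
        xn = a * x + b * u
        return q * x * x + r * u * u + w_old * xn * xn

    ts = [target(x) for x in xs]  # Bellman targets for this sweep
    # Inner loop: gradient descent fits w to the Bellman targets.
    for _ in range(100):
        grad = sum(2 * (w * x * x - t) * x * x
                   for x, t in zip(xs, ts)) / len(xs)
        w -= 0.05 * grad

print(round(w, 3))  # approaches the Riccati fixed point (about 1.484)
```

The same structure scales up in the obvious way: the closed-form greedy step becomes a minimization over each agent's policy, and the single weight w becomes the weight vector of an NN value-function approximator.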
Published in: IEEE Transactions on Neural Networks and Learning Systems