Abstract

In nonzero-sum games, players adopt different strategies to reach a Nash equilibrium. Under hierarchical optimization and asymmetric information, however, the Nash equilibrium is ineffective; the Stackelberg game instead provides a leader–follower strategy to cope with this situation. This paper solves two-player continuous-time nonlinear Stackelberg differential games with unknown system dynamics using an off-policy integral reinforcement learning (IRL) technique. The Stackelberg differential game is a two-level hierarchical optimal control problem, and finding the open-loop Stackelberg equilibrium is equivalent to solving a set of coupled Hamilton–Jacobi (HJ) equations. First, the optimal strategies of the leader and the follower are derived. Then, the asymptotic stability of the closed-loop system under the optimal control solution is proved. Based on the policy iteration (PI) algorithm, an off-policy IRL algorithm is developed to solve the Stackelberg game with completely unknown system dynamics. The control strategies and cost functions of the leader and the follower are approximated using neural networks (NNs), and the NN weight estimation errors of the off-policy algorithm are shown to be uniformly ultimately bounded (UUB). Finally, two simulation examples demonstrate that the proposed learning algorithm obtains the two-player Stackelberg equilibrium strategies.
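To make the off-policy IRL idea concrete, the following is a minimal sketch of the evaluate-then-improve structure the abstract describes, reduced to a hypothetical single-player scalar linear-quadratic stand-in in plain NumPy. It is not the paper's two-player Stackelberg algorithm: the value function is a single quadratic basis x² (the degenerate one-term case of an NN critic), and the plant parameters, cost weights, and probing signal are all illustrative assumptions. Data are generated by an exploratory behavior policy, and each policy-iteration step identifies the value weight and the improved gain from trajectory data alone, with no model knowledge.

```python
import numpy as np

# Illustrative scalar plant and quadratic cost; the true (a, b) are used
# only to simulate data and are never given to the learner.
a, b = -1.0, 2.0
q, r = 1.0, 1.0
dt, T = 1e-3, 0.05            # Euler step and IRL integration window
steps = int(T / dt)

def collect(k, n_windows, x, t):
    """Roll out a behavior policy (current gain + probing signal) and
    build one least-squares row per window from the off-policy IRL
    relation  p*(x_T^2 - x_0^2) - 2*r*w*I_xu = -(q + r*k^2)*I_x2,
    whose unknowns are the value weight p and the improved gain w."""
    rows, rhs = [], []
    for _ in range(n_windows):
        x0, I_x2, I_xu = x, 0.0, 0.0
        for _ in range(steps):
            e = 0.5 * (np.sin(7.0 * t) + np.sin(11.3 * t))  # probing signal
            u_b = -k * x + e                # behavior input (off-policy)
            I_x2 += x * x * dt
            I_xu += (u_b + k * x) * x * dt  # mismatch between behavior and target
            x += (a * x + b * u_b) * dt     # Euler simulation step
            t += dt
        rows.append([x * x - x0 * x0, -2.0 * r * I_xu])
        rhs.append(-(q + r * k * k) * I_x2)
    return np.array(rows), np.array(rhs), x, t

k, x, t = 0.0, 1.0, 0.0        # initial stabilizing gain (a < 0)
for _ in range(10):            # policy iteration: evaluate, then improve
    A, c, x, t = collect(k, 50, x, t)
    p, w = np.linalg.lstsq(A, c, rcond=None)[0]
    if abs(w - k) < 1e-4:
        break
    k = w                      # policy improvement step

p_star = r * (a + np.sqrt(a * a + q * b * b / r)) / (b * b)  # analytic Riccati
print(f"learned gain k = {k:.4f}  vs  analytic k* = {b * p_star / r:.4f}")
```

Because the evaluation relation keeps the behavior input explicit, the probing signal introduces no bias into the learned weights; this is the essential off-policy property the abstract relies on, here shown under the stated single-player simplification.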
