Abstract

This paper investigates a hierarchical decision-making problem for two players governed by a continuous-time linear system. The problem is formulated as a Stackelberg game, in which one player, called the leader, decides first, and the other player, called the follower, then reacts optimally to the leader’s decision. We first establish two coupled Hamilton-Jacobi-Bellman (HJB) equations and show that the solutions to these HJB equations not only stabilize the system but also constitute the Stackelberg equilibrium policy. Because the HJB equations are difficult to solve analytically, we develop a new partially model-free value iteration (VI) algorithm with a two-level decision-making structure. To implement the proposed VI algorithm, we employ neural networks (NNs) to approximate the value functions and use a least-squares method to update the NN weights. Finally, a simulation example is presented to verify the effectiveness of the proposed algorithm.
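As a rough illustration of the least-squares weight update mentioned above, the sketch below fits a linear-in-parameters approximator V(x) ≈ wᵀφ(x) to sampled targets. It is not the paper's algorithm: the two-dimensional state, the quadratic basis `quadratic_basis`, the matrix `P`, and the way the targets are generated are all assumptions made for this example, and the two-level leader/follower structure is omitted.

```python
import numpy as np

def quadratic_basis(x):
    """Hypothetical basis phi(x) = [x1^2, x1*x2, x2^2] for a 2-D state."""
    x1, x2 = x
    return np.array([x1 * x1, x1 * x2, x2 * x2])

def least_squares_weights(states, targets):
    """Solve min_w ||Phi w - targets||^2, where row i of Phi is phi(x_i)."""
    Phi = np.vstack([quadratic_basis(x) for x in states])
    w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    return w

# Assumed test problem: targets come from a known quadratic V(x) = x^T P x.
# In a value iteration scheme the targets would instead come from the
# current iterate of the value function.
rng = np.random.default_rng(0)
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])
states = rng.uniform(-1.0, 1.0, size=(50, 2))
targets = np.array([x @ P @ x for x in states])

w = least_squares_weights(states, targets)
print(w)  # weights on [x1^2, x1*x2, x2^2]
```

Because the targets here are exactly representable in the chosen basis, the fitted weights should match the coefficients of xᵀPx, namely P₁₁, 2P₁₂, and P₂₂. With a richer value function or noisy data, the fit would only be approximate.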
