Abstract

This paper develops an integral value iteration (VI) method to efficiently find, online, the Nash equilibrium solution of two-player non-zero-sum (NZS) differential games for linear systems with partially unknown dynamics. To guarantee closed-loop stability about the Nash equilibrium, an explicit upper bound on the discount factor is given. To show the efficacy of the presented online model-free solution, the integral VI method is compared with the model-based offline policy iteration method. Moreover, a detailed theoretical analysis of the integral VI algorithm is provided in three respects: the positive definiteness of the updated cost functions, the stability of the closed-loop system, and the conditions that guarantee monotone convergence. Finally, simulation results demonstrate the efficacy of the presented algorithms.

Highlights

  • Game theory is a powerful and natural framework to represent the interactions among multiple players, where each player seeks to maximize its own interest

  • Non-zero-sum (NZS) games can account for both individual self-interest and global group interest, as in mixed H2/H∞ control [4]

  • For NZS differential games, the Nash equilibrium solution is found by solving coupled Hamilton-Jacobi equations (HJEs) for nonlinear systems and coupled algebraic Riccati equations (CAREs) for linear systems [11], [12]
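
To make the last point concrete, here is the standard form of the coupled AREs for an N-player linear-quadratic NZS game with dynamics ẋ = Ax + Σⱼ Bⱼuⱼ and linear state-feedback policies uⱼ = -Kⱼx. This is an illustrative textbook formulation, not necessarily the exact equations used in the paper; the symbols Pᵢ, Qᵢ, and Rᵢⱼ denote the usual value, state-weight, and control-weight matrices:

```latex
% Coupled AREs (standard undiscounted form): each player i's value matrix P_i
% satisfies, with gains K_j = R_{jj}^{-1} B_j^\top P_j and closed-loop matrix
% A_c = A - \sum_{j=1}^{N} B_j K_j,
0 = A_c^\top P_i + P_i A_c + Q_i + \sum_{j=1}^{N} K_j^\top R_{ij} K_j,
  \qquad i = 1, \dots, N.
```

The equations are coupled because each player's closed-loop matrix A_c depends on all players' gains; in a discounted formulation such as the paper's, A_c is additionally shifted by a discount-dependent term.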


Summary

INTRODUCTION

Game theory is a powerful and natural framework for representing the interactions among multiple players, each of whom seeks to maximize its own interest. For NZS differential games, the Nash equilibrium solution is found by solving coupled HJEs for nonlinear systems and CAREs for linear systems [11], [12]. For model-free ADP/RL methods, it is desirable to obviate the need for an admissible initial policy while still guaranteeing closed-loop stability; to this end, a novel integral VI method is developed that removes this requirement and maintains closed-loop stability during the learning process. The main contributions of this paper are summarized as follows: 1) a novel data-driven value iteration algorithm is developed for solving NZS games for linear dynamical systems; 3) a theoretical analysis of the presented data-driven value iteration algorithm is provided, covering the positive definiteness of the iterative value function, closed-loop stability, and convergence to the optimal solution.
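
The model-based offline policy iteration baseline that the paper compares against can be sketched as follows. This is a minimal illustrative sketch under made-up assumptions, not the paper's algorithm: the system matrices, cost weights, discount factor, and the simplification R12 = R21 = 0 (no cross control-weighting) are all chosen for the example, and the Lyapunov equations are solved by plain Kronecker vectorization rather than a dedicated solver.

```python
# Illustrative offline policy iteration for a two-player LQ NZS game.
# All numerical values below are example assumptions, not taken from the paper.
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A.T @ P + P @ A + Q = 0 by vectorization:
    (I (x) A.T + A.T (x) I) vec(P) = -vec(Q)."""
    n = A.shape[0]
    M = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
    P = np.linalg.solve(M, -Q.reshape(n * n, order="F")).reshape(n, n, order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off

# Scalar system written as 1x1 matrices so the code stays fully general.
A = np.array([[-0.5]])                       # Hurwitz, so K1 = K2 = 0 is admissible
B1 = np.array([[1.0]]); B2 = np.array([[1.0]])
Q1 = np.array([[1.0]]); Q2 = np.array([[2.0]])
R11 = np.array([[1.0]]); R22 = np.array([[1.0]])
gamma = 0.1                                  # discount factor (assumed small enough)

K1 = np.zeros((1, 1)); K2 = np.zeros((1, 1))
for _ in range(100):
    # Discounted closed-loop matrix under the current policies.
    As = A - B1 @ K1 - B2 @ K2 - 0.5 * gamma * np.eye(1)
    # Policy evaluation: one Lyapunov equation per player (cross terms R12,
    # R21 are omitted in this illustration).
    P1 = solve_lyapunov(As, Q1 + K1.T @ R11 @ K1)
    P2 = solve_lyapunov(As, Q2 + K2.T @ R22 @ K2)
    # Policy improvement.
    K1 = np.linalg.solve(R11, B1.T @ P1)
    K2 = np.linalg.solve(R22, B2.T @ P2)
```

Each evaluation step requires a stabilizing pair of policies, which is exactly the admissibility requirement the paper's integral VI method is designed to remove; the sketch also uses the full model (A, B1, B2), whereas the paper's method is data-driven.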

PROBLEM FORMULATION
OFFLINE POLICY ITERATION ALGORITHM
EQUIVALENT INTEGRAL VI WITH DISCOUNT FACTOR
MAIN RESULTS
POSITIVE DEFINITENESS OF THE ITERATIVE VALUE FUNCTION
STABILITY DISCUSSION
CONVERGENCE ANALYSIS
SIMULATION STUDY
