Abstract

This paper addresses the problem of finding optimal output feedback strategies for linear differential zero-sum games using a model-free approach based on adaptive dynamic programming (ADP). In contrast to their discrete-time counterparts, differential games involve continuous-time dynamics, and existing ADP approaches to their solution require full measurement of the internal state. The difficulty is that a direct translation of the discrete-time output feedback ADP results requires derivatives of the input and output measurements, which is generally prohibitive in practice. This work overcomes this difficulty by presenting a new embedded filtering-based observer approach to designing output feedback ADP algorithms for the differential zero-sum game problem. Two output feedback ADP algorithms, based respectively on policy iteration and value iteration, are developed. The proposed scheme is fully online and requires no knowledge of the system dynamics. In addition, this work addresses the excitation bias problem encountered in output feedback ADP methods, which typically requires a discounting factor for its mitigation. We show that the proposed scheme is bias-free and therefore does not require a discounting factor. It is further shown that the proposed algorithms converge to the solution of the game algebraic Riccati equation. Two numerical examples are presented to validate the proposed scheme.
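For context, the game algebraic Riccati equation referenced above takes the following standard form for a linear zero-sum differential game; the notation here is assumed, since the abstract does not fix one. For dynamics and cost

\dot{x} = A x + B_1 u + B_2 w, \qquad J = \int_0^\infty \big( x^\top Q x + u^\top R u - \gamma^2 w^\top w \big)\, dt,

where u is the minimizing control and w is the maximizing disturbance, the game algebraic Riccati equation reads

A^\top P + P A + Q - P B_1 R^{-1} B_1^\top P + \gamma^{-2} P B_2 B_2^\top P = 0,

with saddle-point policies u^* = -R^{-1} B_1^\top P x and w^* = \gamma^{-2} B_2^\top P x.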
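As a point of reference for the fixed point the model-free algorithms converge to, the sketch below solves this equation with a standard model-based policy iteration: Lyapunov-equation policy evaluation followed by simultaneous gain updates for both players. This is not the paper's output feedback scheme; it assumes full knowledge of (A, B1, B2) and state access, and the system matrices are illustrative placeholders.

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical 2-state example; A, B1, B2, Q, R, gamma are placeholders.
A  = np.array([[0.0, 1.0], [-1.0, -2.0]])   # Hurwitz, so K = L = 0 is admissible
B1 = np.array([[0.0], [1.0]])               # control input channel
B2 = np.array([[0.0], [0.5]])               # disturbance channel
Q  = np.eye(2)
R  = np.eye(1)
gamma = 2.0

K = np.zeros((1, 2))  # initial control (minimizer) gain
L = np.zeros((1, 2))  # initial disturbance (maximizer) gain

for i in range(50):
    Ac  = A - B1 @ K + B2 @ L                       # closed loop under both policies
    rhs = -(Q + K.T @ R @ K - gamma**2 * L.T @ L)   # stage cost under current policies
    P = solve_continuous_lyapunov(Ac.T, rhs)        # policy evaluation: Ac^T P + P Ac = rhs
    K_new = np.linalg.solve(R, B1.T @ P)            # policy improvement for the minimizer
    L_new = (B2.T @ P) / gamma**2                   # policy improvement for the maximizer
    if np.linalg.norm(K_new - K) + np.linalg.norm(L_new - L) < 1e-10:
        break
    K, L = K_new, L_new

print("P =\n", P)  # at convergence, P satisfies the game algebraic Riccati equation

Convergence of this simultaneous-update variant is not guaranteed in general (classical results use nested player loops); it is shown only to make the target solution concrete.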
