Abstract

This paper first presents an overall view for dynamical decision-making in teams, both cooperative and competitive. Strategies for team decision problems, including optimal control, zero-sum 2-player games (H-infinity control) and so on are normally solved for off-line by solving associated matrix equations such as the Riccati equation. However, using that approach, players cannot change their objectives online in real time without calling for a completely new off-line solution for the new strategies. Therefore, in this paper we give a method for learning optimal team strategies online in real time as team dynamical play unfolds. In the linear quadratic regulator case, for instance, the method learns the Riccati equation solution online without ever solving the Riccati equation. This allows for truly dynamical team decisions where objective functions can change in real time and the system dynamics can be time-varying.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call