Abstract

A sequential stochastic game among an arbitrary number of players, all of whom receive identical payoffs, is analyzed. The players are unaware that they are in a game, and hence have no knowledge of the other players' strategies or of the payoff structure. At each instant, each player uses a simple learning algorithm to update its mixed strategy based entirely on the response of a random environment. It is shown that the expected change in each player's payoff is nonnegative at every instant, so that the group's expected performance improves monotonically. This result appears to have important implications for decentralized decision-making in large complex systems.
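The abstract does not name the learning algorithm, so the following is only an illustrative sketch under stated assumptions. It assumes a linear reward-inaction update (a common learning-automaton scheme), a single-state game rather than the sequential setting analyzed here, and a binary reward drawn with a probability that depends on the joint action. All function names, the step size, and the payoff table are hypothetical. The sketch shows how each player can adjust its mixed strategy using only its own action and the shared environment response, without observing the other players.

```python
import random

def lri_update(probs, chosen, reward, step=0.05):
    """Assumed linear reward-inaction rule: on reward, shift probability
    mass toward the chosen action; on no reward, leave probs unchanged."""
    if not reward:
        return list(probs)
    return [p + step * (1.0 - p) if j == chosen else p * (1.0 - step)
            for j, p in enumerate(probs)]

def play_round(strategies, payoff_prob, rng, step=0.05):
    """One instant of play: each player samples an action from its own
    mixed strategy, the environment returns one common binary reward,
    and every player updates using only that reward."""
    actions = tuple(rng.choices(range(len(p)), weights=p)[0]
                    for p in strategies)
    reward = rng.random() < payoff_prob[actions]
    updated = [lri_update(p, a, reward, step)
               for p, a in zip(strategies, actions)]
    return updated, reward

def expected_payoff(strategies, payoff_prob):
    """Expected common payoff under the current mixed strategies."""
    total = 0.0
    for actions, d in payoff_prob.items():
        weight = 1.0
        for p, a in zip(strategies, actions):
            weight *= p[a]
        total += weight * d
    return total

if __name__ == "__main__":
    # Hypothetical two-player, two-action game with identical payoffs.
    payoff_prob = {(0, 0): 0.9, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.6}
    strategies = [[0.5, 0.5], [0.5, 0.5]]
    rng = random.Random(0)
    for _ in range(2000):
        strategies, _ = play_round(strategies, payoff_prob, rng)
    print(strategies, expected_payoff(strategies, payoff_prob))
```

Each player's update depends only on its own sampled action and the common reward, which mirrors the decentralized setting described in the abstract. A single simulated run is noisy, so it illustrates the mechanism only and does not demonstrate the monotonicity result, which concerns expected changes.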

