Abstract

In this and the next chapter we present an application of the learning algorithms developed in the previous chapters to two-person zero-sum games. Let A and B be the two players. Both are allowed to use mixed strategies. At each instant, each player picks a pure strategy as a sample realization from his mixed strategy. As a result of their joint action they receive a random outcome, which is either a success or a failure. Since the game is zero-sum, A's success is B's failure and vice versa. The following assumptions are fundamental to our analysis: neither player has any knowledge of the set of pure strategies available to the other player, of the pure strategy actually chosen by the other player at any stage of the game, or of the distribution of the random outcome as a function of the pure strategies chosen by the two players. Each player individually updates his mixed strategy using a learning algorithm, based solely on the pure strategy he chose and the random outcome he received. This cycle repeats, so the game is played sequentially. In short, we consider a zero-sum game between two players who are totally decentralized: there is no communication or transfer of information between them either before or during the play of the game, and they may not even know that they are involved in a game at all. In this setup, our aim is to find conditions on the learning algorithms under which both players will, in the long run, receive an expected payoff as close as desired to the well-established game-theoretic solution (the von Neumann value).
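
The cycle described above (sample a pure strategy, observe a success or failure, update the mixed strategy) can be illustrated with a minimal sketch. The abstract does not name the learning algorithm, so the linear reward-inaction update used here is an assumption chosen for illustration, and the success-probability matrix `D`, the step size, and all function names are hypothetical. Note that each player's update sees only his own chosen strategy and his own outcome.

```python
import random

# Hypothetical game: D[i][j] is the probability that A succeeds (and B fails)
# when A plays pure strategy i and B plays pure strategy j.
# This particular matrix has a pure-strategy saddle point at (0, 0), value 0.6.
D = [[0.60, 0.80],
     [0.35, 0.90]]

def sample(p, rng):
    """Draw a pure-strategy index as a realization of the mixed strategy p."""
    return rng.choices(range(len(p)), weights=p, k=1)[0]

def reward_inaction(p, chosen, success, step):
    """Assumed update rule (linear reward-inaction), not taken from the source.

    On success, move probability mass toward the chosen strategy;
    on failure, leave the mixed strategy unchanged.
    """
    if not success:
        return list(p)
    return [pi + step * (1.0 - pi) if i == chosen else pi * (1.0 - step)
            for i, pi in enumerate(p)]

def play(d, rounds=20000, step=0.01, seed=0):
    """Play the game sequentially; return the final mixed strategies of A and B."""
    rng = random.Random(seed)
    p = [1.0 / len(d)] * len(d)        # A starts from the uniform mixed strategy
    q = [1.0 / len(d[0])] * len(d[0])  # B starts from the uniform mixed strategy
    for _ in range(rounds):
        i = sample(p, rng)             # A's pure strategy, unknown to B
        j = sample(q, rng)             # B's pure strategy, unknown to A
        a_success = rng.random() < d[i][j]
        # Zero-sum: A's success is B's failure. Each player uses only
        # his own action and his own outcome.
        p = reward_inaction(p, i, a_success, step)
        q = reward_inaction(q, j, not a_success, step)
    return p, q

if __name__ == "__main__":
    p, q = play(D)
    print("A's mixed strategy:", p)
    print("B's mixed strategy:", q)
```

The update keeps each mixed strategy a valid probability vector at every step. Whether the resulting expected payoff approaches the von Neumann value depends on the algorithm and its parameters, which is the question the chapter addresses.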
