A near-optimal polynomial time algorithm for learning in certain classes of stochastic games

Ronen I Brafman, Moshe Tennenholtz

doi:10.1016/s0004-3702(00)00039-4

Abstract

We present a new algorithm for polynomial time learning of optimal behavior in single-controller stochastic games. This algorithm incorporates and integrates important recent results of Kearns and Singh (Proc. ICML-98, 1998) in reinforcement learning and of Monderer and Tennenholtz (J. Artif. Intell. Res. 7, 1997, p. 231) in repeated games. In stochastic games, the agent must cope with an adversary whose actions can be arbitrary. In particular, this adversary can withhold information about the game matrix by never, or only rarely, performing certain actions. This creates an exploration versus exploitation dilemma more complex than the one in Markov decision processes: given information about only some parts of a game matrix, the agent must decide how much effort to invest in learning the unknown parts. We present a polynomial time algorithm that addresses these issues for the class of single-controller stochastic games, providing the agent with near-optimal return.

Journal: Artificial Intelligence
Publication Date: Aug 1, 2000
Citations: 32
License type: elsevier-specific: oa user license

Similar Papers

Remarks on sensitive equilibria in stochastic games with additive reward and transition structure
Andrzej S Nowak
Mathematical Methods of Operations Research | VOL. 64
10 Aug 2006

Quadratic programming and the single-controller stochastic game
Jerzy A Filar
Journal of Mathematical Analysis and Applications | VOL. 113
01 Jan 1986

An Experimental Study of Different Approaches to Reinforcement Learning in Common Interest Stochastic Games
Avi Bab ... Ronen Brafman
01 Jan 2004

Risk-Averse Designs: From Exponential Cost to Stochastic Games
Tamer Başar
01 Jan 1999
