The asymptotic equipartition property in reinforcement learning and its relation to return maximization

Kazunori Iwata, Kazushi Ikeda, Hideaki Sakai

doi:10.1016/j.neunet.2005.02.008

Abstract

We discuss an important property of empirical sequences in reinforcement learning, the asymptotic equipartition property. It states that, if the number of time steps is sufficiently large, the typical set of empirical sequences has probability nearly one, all elements of the typical set are nearly equiprobable, and the number of elements in the typical set is an exponential function of the sum of conditional entropies. This sum is referred to as the stochastic complexity. Using the property, we show that return maximization depends on two factors: the stochastic complexity and a quantity that depends on the parameters of the environment. Here, return maximization means that the best sequences in terms of expected return have probability one. We also examine the sensitivity of the stochastic complexity, which serves as a qualitative guide for tuning the parameters of the action-selection strategy, and give a sufficient condition for return maximization in probability.
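The three claims in the abstract can be checked numerically in the simplest classical setting. The sketch below is not the paper's formulation for empirical sequences in reinforcement learning: it assumes an i.i.d. Bernoulli source as a stand-in, and the names `entropy`, `typical_set_stats`, `p`, `n` and `eps` are hypothetical. It computes the probability of the typical set and the logarithm of its size per time step, which should approach one and the entropy, respectively, as the number of steps grows.

```python
import math


def entropy(p):
    """Shannon entropy, in nats, of a Bernoulli(p) source."""
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def typical_set_stats(p, n, eps):
    """Return (probability, log-size per step) of the eps-typical set.

    Assumption: an i.i.d. Bernoulli(p) source, used only as a simple
    stand-in for the empirical sequences discussed in the abstract.
    A length-n sequence with k ones has probability p^k (1-p)^(n-k),
    so all sequences sharing the same k are handled together.
    """
    h = entropy(p)
    prob = 0.0
    size = 0
    for k in range(n + 1):
        # Log-probability of any single sequence containing k ones.
        log_prob = k * math.log(p) + (n - k) * math.log(1 - p)
        # Typical: per-step log-probability is within eps of the entropy.
        if abs(-log_prob / n - h) <= eps:
            count = math.comb(n, k)  # number of sequences with k ones
            size += count
            # Work in the log domain to avoid overflow for large n.
            prob += math.exp(math.log(count) + log_prob)
    log_size_per_step = math.log(size) / n if size else float("-inf")
    return prob, log_size_per_step


if __name__ == "__main__":
    for n in (50, 200, 1000):
        prob, rate = typical_set_stats(p=0.3, n=n, eps=0.05)
        print(f"n={n}: P(typical)={prob:.4f}, "
              f"log|typical|/n={rate:.4f}, H={entropy(0.3):.4f}")
```

Running the script for increasing `n` shows the typical set capturing more of the probability mass while its per-step log-size stays within `eps` of the entropy. Each typical sequence has per-step log-probability within `eps` of the entropy by construction, which is the near-equiprobability claim.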

Journal: Neural networks : the official journal of the International Neural Network Society
Publication Date: Oct 3, 2005
Citations: 20

Similar Papers

Elements of Information Theory
Thomas M Cover ... Joy A Thomas
01 Jan 1991

An information-theoretic analysis of return maximization in reinforcement learning
Kazunori Iwata
Neural networks : the official journal of the International Neural Network Society | VOL. 24
17 May 2011

Stochastic Processes for Return Maximization in Reinforcement Learning
Kazunori Iwata ... Hideaki Sakai
01 Jan 2004

Hyperspectral Images Classification with Typical Sequences associated to the Endmember
Samir Youssif Wehbi Arabi ... David Fernandes
IEEE Latin America Transactions | VOL. 14
01 Jul 2016
