Model selection in reinforcement learning

Amir-Massoud Farahmand, Csaba Szepesvári

doi:10.1007/s10994-011-5254-7

Abstract

We consider the problem of model selection in the batch (offline, non-interactive) reinforcement learning setting when the goal is to find an action-value function with the smallest Bellman error among a countable set of candidate functions. We propose a complexity regularization-based model selection algorithm, BErMin, and prove that it enjoys an oracle-like property: the estimator's error differs from that of an oracle, who selects the candidate with the minimum Bellman error, by only a constant factor and a small remainder term that vanishes at a parametric rate as the number of samples increases. As an application, we consider the problem in which the true action-value function belongs to an unknown member of a nested sequence of function spaces. We show that, under some additional technical conditions, BErMin leads to a procedure whose rate of convergence, up to a constant factor, matches that of an oracle who knows which of the nested function spaces the true action-value function belongs to, i.e., the procedure achieves adaptivity.
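The general idea of complexity-regularized selection among candidates can be illustrated with a minimal sketch. This is not the BErMin algorithm itself, whose details the abstract does not give. The function name `select_candidate`, the penalty form `c * (complexity + log(1/delta)) / n_samples`, and all numbers below are assumptions chosen for illustration only. The sketch takes an already-computed Bellman error estimate for each candidate and picks the candidate that minimizes the estimate plus a penalty that shrinks as the sample size grows.

```python
import math


def select_candidate(est_bellman_errors, complexities, n_samples,
                     delta=0.05, c=1.0):
    """Pick the candidate minimizing (estimated error + complexity penalty).

    All arguments are hypothetical inputs for illustration:
      est_bellman_errors: estimated squared Bellman error of each candidate
      complexities:       a complexity measure for each candidate
      n_samples:          number of samples used to form the estimates
      delta, c:           confidence level and penalty constant (assumed)
    """
    if len(est_bellman_errors) != len(complexities):
        raise ValueError("need one complexity value per candidate")
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")

    best_index, best_score = None, math.inf
    for k, (err, comp) in enumerate(zip(est_bellman_errors, complexities)):
        # Assumed penalty form: grows with complexity, shrinks like 1/n.
        penalty = c * (comp + math.log(1.0 / delta)) / n_samples
        score = err + penalty
        if score < best_score:
            best_index, best_score = k, score
    return best_index, best_score


# Illustrative numbers only: three candidates of increasing complexity.
errors = [0.30, 0.12, 0.10]
complexities = [1, 5, 200]
small_n_choice, _ = select_candidate(errors, complexities, n_samples=100)
large_n_choice, _ = select_candidate(errors, complexities, n_samples=100_000)
```

With few samples the penalty dominates and the sketch prefers a simpler candidate; with many samples the penalty becomes negligible and the candidate with the smallest estimated error wins. This mirrors, in a loose way, the trade-off behind the oracle-like guarantee described in the abstract.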

Journal: Machine Learning
Publication Date: Jun 11, 2011
Citations: 71

Similar Papers

Joint Bayesian model selection and blind equalization of ISI channels
Zaifei Liu ... A Doucet
28 Aug 2005

Regularization in reinforcement learning
01 Jan 2010

Statistical inference problems with applications to computational structural biology
02 Mar 2017

Learning varying dimension radial basis functions for deformable image alignment
Di Yang ... Hongdong Li
01 Sep 2009
