Reinforcement learning with guided policy search using Gaussian processes

Hunor S Jakab, Lehel Csato

doi:10.1109/ijcnn.2012.6252509

Abstract

Gradient-based policy search algorithms benefit greatly from a well-estimated state or state-action value function, which can be used to reduce the variance of the gradient estimates. Additionally, using Gaussian processes for value function approximation provides a fully probabilistic model in which the uncertainty of the estimated value function can be used to assess the amount of exploration required. In this article we present two ways of adjusting different characteristics of the exploration in on-line learning of control policies for problems with continuous state-action spaces. The proposed methods exploit the fully probabilistic nature of Gaussian processes and aim to constrain exploration to relevant subspaces, thereby speeding up convergence. We present experiments on a simulated control task to demonstrate the validity of our algorithms.
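
The idea of using value-function uncertainty to decide how much to explore can be illustrated with a minimal sketch. This is not the authors' algorithm: the squared-exponential kernel, the hyperparameters, the toy returns, and the rule in the hypothetical helper exploration_std (exploration noise grows with the GP predictive standard deviation) are all assumptions chosen for illustration only.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def gp_posterior(X_train, y_train, X_query, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at the query points."""
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query)
    prior_var = np.diag(rbf_kernel(X_query, X_query))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = prior_var - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)


def exploration_std(var, base_std=0.05, gain=1.0):
    """Hypothetical rule: explore more where the value estimate is uncertain."""
    return base_std + gain * np.sqrt(var)


# Toy data (assumed): 20 visited state-action pairs in [-1, 1]^2
# with made-up returns standing in for observed values.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(20, 2))
y_train = -np.sum(X_train**2, axis=1)

# One query inside the visited region, one far outside it.
X_query = np.array([[0.0, 0.0], [4.0, 4.0]])
mean, variance = gp_posterior(X_train, y_train, X_query)
stds = exploration_std(variance)

print("predictive variance:", variance)
print("exploration std:", stds)
```

Under these assumptions the query far from the visited data has a predictive variance close to the prior variance and therefore receives the larger exploration noise, while the query inside the visited region is explored less.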
