TEXPLORE: real-time sample-efficient reinforcement learning for robots

Todd Hester, Peter Stone

doi:10.1007/s10994-012-5322-7

Abstract

The use of robots in society could be expanded by using reinforcement learning (RL) to allow robots to learn and adapt to new situations online. RL is a paradigm for learning sequential decision-making tasks, usually formulated as a Markov Decision Process (MDP). For an RL algorithm to be practical for robotic control tasks, it must learn from very few samples while continually taking actions in real time. In addition, the algorithm must learn efficiently in the face of noise, sensor/actuator delays, and continuous state features. In this article, we present TEXPLORE, the first algorithm to address all of these challenges together. TEXPLORE is a model-based RL method that learns a random forest model of the domain, which generalizes dynamics to unseen states. The agent explores states that are promising for the final policy, while ignoring states that do not appear promising. With sample-based planning and a novel parallel architecture, TEXPLORE can select actions continually in real time whenever necessary. We empirically evaluate the importance of each component of TEXPLORE in isolation and then demonstrate the complete algorithm learning to control the velocity of an autonomous vehicle in real time.
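The abstract does not spell out the parallel architecture, so the following is only a toy sketch of the general idea that acting need not wait for planning. Everything in it is an assumption made for illustration: the names `RealTimeAgent` and `build_chain_agent` are hypothetical, a plain lookup table stands in for the learned random forest model, and simple one-step value backups on sampled model transitions stand in for the paper's sample-based planner.

```python
import random
import threading
import time


class RealTimeAgent:
    """Toy agent (hypothetical): planning runs in a background thread."""

    def __init__(self, n_states, n_actions, gamma=0.9, seed=0):
        self.n_actions = n_actions
        self.gamma = gamma
        self.rng = random.Random(seed)  # used only by the planner thread
        # Assumption: a lookup table stands in for the learned random forest model.
        self.model = {}  # (state, action) -> (next_state, reward)
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.lock = threading.Lock()
        self.stop = threading.Event()
        self.planner = threading.Thread(target=self._plan_forever, daemon=True)

    def start(self):
        self.planner.start()

    def shutdown(self):
        self.stop.set()
        self.planner.join()

    def update_model(self, s, a, s2, r):
        """Record one observed transition in the model."""
        with self.lock:
            self.model[(s, a)] = (s2, r)

    def select_action(self, s):
        """Return the greedy action from the current value estimates."""
        with self.lock:
            row = self.q[s]
            return max(range(self.n_actions), key=lambda a: row[a])

    def _plan_forever(self):
        """Back up values from transitions sampled out of the model until stopped."""
        while not self.stop.is_set():
            with self.lock:
                known = list(self.model.items())
            if not known:
                time.sleep(0.001)
                continue
            (s, a), (s2, r) = self.rng.choice(known)
            with self.lock:
                self.q[s][a] = r + self.gamma * max(self.q[s2])


def build_chain_agent():
    """Three-state chain: action 1 moves right; the step from state 1 to 2 pays reward 1."""
    agent = RealTimeAgent(n_states=3, n_actions=2)
    for s in range(3):
        agent.update_model(s, 0, s, 0.0)  # action 0: stay in place
        s2 = min(s + 1, 2)
        agent.update_model(s, 1, s2, 1.0 if (s, s2) == (1, 2) else 0.0)
    return agent
```

In this sketch, calling `build_chain_agent()`, then `start()`, and polling `select_action(0)` should return action 1 (move right) once the planner thread has had a short time to run; `shutdown()` stops the planner. The action call only reads the current estimates under a briefly held lock, which is the property the sketch is meant to illustrate.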

Journal: Machine Learning
Publication Date: Oct 24, 2012
Citations: 111

Similar Papers

The Open-Source TEXPLORE Code Release for Reinforcement Learning on Robots
Todd Hester ... Peter Stone
01 Jan 2014

Safe Model-Based Reinforcement Learning for Systems With Parametric Uncertainties
S M Nahid Mahmud ... Rushikesh Kamalapurkar
Frontiers in Robotics and AI | VOL. 8
16 Dec 2021

An MDP Approach for Defending Against Fraud Attack in Cognitive Radio Networks
Hadi Shahriar Shahhoseini ... Khadijeh Afhamisisi
IETE Journal of Research | VOL. 61
01 Apr 2015
