Gamma-Nets: Generalizing Value Estimation over Timescale

Craig Sherstan, Johannes Günther, Shibhansh Dohare, James MacGlashan, Patrick M Pilarski

doi:10.1609/aaai.v34i04.6027

Abstract

Temporal abstraction is a key requirement for agents making decisions over long time horizons, a fundamental challenge in reinforcement learning. There are many reasons why value estimates at multiple timescales might be useful; recent work has shown that value estimates at different timescales can be the basis for creating more advanced discounting functions and for driving representation learning. Further, predictions at many different timescales serve to broaden an agent's model of its environment. One predictive approach of interest within an online learning setting is general value functions (GVFs), which represent models of an agent's world as a collection of predictive questions, each defined by a policy, a signal to be predicted, and a prediction timescale. In this paper we present Γ-nets, a method for generalizing value function estimation over timescale, allowing a given GVF to be trained and queried for arbitrary timescales so as to greatly increase the predictive ability and scalability of a GVF-based model. The key to our approach is to use timescale as one of the value estimator's inputs. As a result, the prediction target for any timescale is available at every timestep and we are free to train on any number of timescales. We first provide two demonstrations by 1) predicting a square wave and 2) predicting sensorimotor signals on a robot arm using a linear function approximator. Next, we empirically evaluate Γ-nets in the deep reinforcement learning setting using policy evaluation on a set of Atari video games. Our results show that Γ-nets can be effective for predicting arbitrary timescales, with only a small cost in accuracy as compared to learning estimators for fixed timescales. Γ-nets provide a method for accurately and compactly making predictions at many timescales without requiring a priori knowledge of the task, making them a valuable contribution to ongoing work on model-based planning, representation learning, and lifelong learning algorithms.
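The idea of feeding the timescale to the estimator can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes a linear estimator, a hypothetical two-state cycle with a constant signal of 1, and a crude one-hot binning of the discount γ as the timescale input. The names features, predict, td_update, and train, and the constants N_STATES and N_BINS, are invented for illustration. Because γ is an input, one transition yields a bootstrapped TD(0) target for every γ we choose to train on.

```python
import numpy as np

N_STATES = 2   # hypothetical two-state cycle, used only for this sketch
N_BINS = 10    # hypothetical resolution of the timescale encoding


def features(state, gamma):
    """One-hot state crossed with a one-hot bin of gamma (an assumed encoding)."""
    x = np.zeros((N_STATES, N_BINS))
    x[state, min(int(gamma * N_BINS), N_BINS - 1)] = 1.0
    return x.ravel()


def predict(w, state, gamma):
    """Linear value estimate that takes the timescale as an input."""
    return float(w @ features(state, gamma))


def td_update(w, state, cumulant, next_state, gammas, alpha=0.1):
    """Apply one TD(0) update per timescale from a single transition."""
    for gamma in gammas:
        x = features(state, gamma)
        # The target for this gamma bootstraps from the same estimator,
        # queried at the same gamma in the next state.
        target = cumulant + gamma * predict(w, next_state, gamma)
        w += alpha * (target - w @ x) * x
    return w


def train(steps=5000, gammas=(0.0, 0.5, 0.9)):
    """Run the two-state cycle with a constant signal of 1.0."""
    w = np.zeros(N_STATES * N_BINS)
    state = 0
    for _ in range(steps):
        next_state = 1 - state
        w = td_update(w, state, 1.0, next_state, gammas)
        state = next_state
    return w
```

With a constant signal of 1, the true value at discount γ is 1/(1 − γ), so after training predict(w, 0, 0.5) should approach 2 and predict(w, 0, 0.9) should approach 10. The binned encoding does not generalize between bins; a smoother encoding of γ would be needed for that, and the choice here is only to keep the sketch short.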

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Apr 3, 2020
Citations: 7
