Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces

Sean R Sinclair, Siddhartha Banerjee, Christina Lee Yu

doi:10.1145/3366703

Abstract

We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel Q-learning policy with adaptive data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions that are frequently visited in historical trajectories and that have higher payoff estimates. We demonstrate how our adaptive partitions take advantage of the shape of the optimal Q-function and the joint space, without sacrificing the worst-case performance. In particular, we recover the regret guarantees of prior algorithms for continuous state-action spaces, which additionally require an optimal discretization as input, access to a simulation oracle, or both. Moreover, experiments demonstrate how our algorithm automatically adapts to the underlying structure of the problem, resulting in much better performance compared to both heuristics and Q-learning with uniform discretization.
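The central idea in the abstract, refining the partition where visits concentrate and payoff estimates are high, can be illustrated with a minimal sketch. This is not the authors' algorithm: the square cells over [0, 1] x [0, 1], the names `Cell`, `select`, and `update`, the fixed learning rate, the optimistic initial estimate of 1.0, and the rule "split once visits reach (1 / side)^2" are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Cell:
    """A square region of a hypothetical joint state-action space [0, 1] x [0, 1]."""
    s0: float            # lower corner, state coordinate
    a0: float            # lower corner, action coordinate
    side: float          # side length of the square
    q: float = 1.0       # optimistic initial payoff estimate (assumption)
    visits: int = 0
    children: List["Cell"] = field(default_factory=list)

    def leaves(self) -> List["Cell"]:
        """Return the cells of the current partition under this cell."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

    def covers_state(self, s: float) -> bool:
        return self.s0 <= s <= self.s0 + self.side


def select(root: Cell, s: float) -> Tuple[Cell, float]:
    """Pick the leaf covering state s with the highest estimate; play its centre action."""
    candidates = [c for c in root.leaves() if c.covers_state(s)]
    best = max(candidates, key=lambda c: c.q)
    return best, best.a0 + best.side / 2


def update(cell: Cell, target: float, lr: float = 0.5) -> None:
    """Move the estimate toward the target, then refine the cell if visited often enough."""
    cell.visits += 1
    cell.q = (1 - lr) * cell.q + lr * target
    # Illustrative splitting rule (assumption): smaller cells need more visits.
    if not cell.children and cell.visits >= (1.0 / cell.side) ** 2:
        half = cell.side / 2
        cell.children = [
            Cell(cell.s0 + i * half, cell.a0 + j * half, half, q=cell.q)
            for i in (0, 1)
            for j in (0, 1)
        ]
```

Because `select` always follows the highest estimate and `update` splits only visited cells, refinement concentrates in regions that are both frequently visited and promising, while rarely visited or low-estimate regions stay coarse.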

Journal: Proceedings of the ACM on Measurement and Analysis of Computing Systems
Publication Date: Dec 17, 2019
Citations: 12

Similar Papers

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces
Sean R Sinclair ... Christina Lee Yu
08 Jun 2020

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces
Sean R Sinclair ... Siddhartha Banerjee
ACM SIGMETRICS Performance Evaluation Review | VOL. 48
08 Jul 2020

Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo ... Junzi Zhang
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 36
28 Jun 2022

Adaptive Discretization in Online Reinforcement Learning
Sean R Sinclair ... Siddhartha Banerjee
Operations Research | VOL. 71
11 Nov 2022
