Abstract

We present Q-functionals, an alternative architecture for continuous control deep reinforcement learning. Instead of returning a single value for a state-action pair, our network transforms a state into a function that can be rapidly evaluated in parallel for many actions, allowing us to efficiently choose high-value actions through sampling. This contrasts with the typical architecture of off-policy continuous control, where a policy network is trained for the sole purpose of selecting actions from the Q-function. We represent our action-dependent Q-function as a weighted sum of basis functions (Fourier, polynomial, etc.) over the action space, where the weights are state-dependent and output by the Q-functional network. Fast sampling makes practical a variety of techniques that require Monte Carlo integration over Q-functions, and enables action-selection strategies besides simple value-maximization. We characterize our framework, describe various implementations of Q-functionals, and demonstrate strong performance on a suite of continuous control tasks.
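To make the idea concrete, below is a minimal sketch of a Q-functional in PyTorch, assuming a Fourier cosine basis over a one-dimensional action in [-1, 1]. The class and function names (QFunctional, select_action) and all hyperparameters are illustrative assumptions, not the authors' implementation; they only show the pattern of mapping a state to basis weights and evaluating many candidate actions in one parallel pass.

```python
# Sketch only: state -> basis coefficients, then Q(s, a) = sum_k w_k(s) * cos(pi * k * a).
import torch
import torch.nn as nn


class QFunctional(nn.Module):
    """Maps a state to coefficients of a Fourier basis over the action space."""

    def __init__(self, state_dim: int, num_basis: int = 8, hidden: int = 64):
        super().__init__()
        self.coeff_net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_basis),
        )
        # Fixed basis frequencies k = 0, 1, ..., num_basis - 1.
        self.register_buffer("freqs", torch.arange(num_basis, dtype=torch.float32))

    def forward(self, state: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
        # state: (state_dim,), actions: (num_actions, 1) -> values: (num_actions,)
        weights = self.coeff_net(state)                     # (num_basis,)
        basis = torch.cos(torch.pi * self.freqs * actions)  # (num_actions, num_basis)
        return basis @ weights                              # (num_actions,)


def select_action(qf: QFunctional, state: torch.Tensor, num_samples: int = 256) -> torch.Tensor:
    """Sample candidate actions, evaluate them all in one parallel pass, take the best."""
    candidates = torch.rand(num_samples, 1) * 2.0 - 1.0  # uniform samples in [-1, 1]
    with torch.no_grad():
        values = qf(state, candidates)
    return candidates[values.argmax()]


if __name__ == "__main__":
    qf = QFunctional(state_dim=4)
    action = select_action(qf, torch.zeros(4))
    print("greedy sampled action:", action.item())
```

Because the basis functions depend only on the actions, the same sampled batch can also be reused for Monte Carlo estimates (e.g., averaging values rather than maximizing) or for softmax-style action selection, which is what makes sampling-based strategies beyond pure value-maximization cheap in this setup.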
