Abstract
Transfer learning approaches in reinforcement learning aim to assist agents in learning their target domains by leveraging knowledge from other agents trained on similar source domains. Recent research in this space has focused on knowledge transfer between tasks with different transition dynamics and reward functions; however, little attention has been paid to transfer between tasks with different action spaces. In this paper, we address transfer learning between domains that differ in their action spaces. We present a reward shaping method based on source embedding similarity that is applicable to domains with both discrete and continuous action spaces. The efficacy of our approach is evaluated on transfer to restricted action spaces in the Acrobot-v1 and Pendulum-v0 domains. A comparison with two baselines shows that our method does not outperform them in the continuous action space setting but does show an improvement in the discrete action space setting. We conclude our analysis with future directions for this work.
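The abstract describes the method only at a high level: the target-domain reward is shaped with a bonus reflecting how similar the current experience is, in an embedding space, to experience from the source domain. The sketch below illustrates that general idea and nothing more; the encoder `embed_fn`, the stored set of source-domain embeddings, the max-similarity aggregation, and the coefficient `beta` are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


class SimilarityRewardShaper:
    """Adds a shaping bonus based on how similar the target-domain
    (state, action) embedding is to embeddings collected in the source
    domain. All components here (encoder, source embeddings, beta) are
    placeholders standing in for whatever the paper actually learns.
    """

    def __init__(self, embed_fn, source_embeddings, beta=0.1):
        self.embed_fn = embed_fn                    # maps (state, action) -> vector
        self.source_embeddings = source_embeddings  # array of source-domain vectors
        self.beta = beta                            # weight of the shaping term

    def shape(self, state, action, env_reward):
        z = self.embed_fn(state, action)
        # Bonus = similarity to the closest source-domain embedding.
        sims = [cosine_similarity(z, s) for s in self.source_embeddings]
        return env_reward + self.beta * max(sims)


# Toy usage with a random linear "encoder" standing in for a learned one.
state_dim, action_dim, embed_dim = 6, 1, 8   # illustrative sizes only
W = rng.normal(size=(embed_dim, state_dim + action_dim))
embed_fn = lambda s, a: W @ np.concatenate([np.atleast_1d(s), np.atleast_1d(a)])

source_embeddings = rng.normal(size=(50, embed_dim))
shaper = SimilarityRewardShaper(embed_fn, source_embeddings, beta=0.1)

s, a, r = rng.normal(size=state_dim), np.array([1.0]), -1.0
print(shaper.shape(s, a, r))
```

Because the shaped reward only adds a bounded, similarity-weighted bonus to the environment reward, the same construction applies whether the action is a discrete index or a continuous vector, which matches the abstract's claim of applicability to both action space types.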