Abstract

This paper discusses the notion of context transfer in reinforcement learning tasks. Context transfer, as defined here, is knowledge transfer between source and target tasks that share the same environment dynamics and reward function but have different state or action spaces. In other words, the agents learn the same task while using different sensors and actuators. This requires the existence of an underlying common Markov decision process (MDP) to which all the agents' MDPs can be mapped, which is formulated using the notion of MDP homomorphism. The learning framework is Q-learning. To transfer knowledge between tasks, the feature space is used as a translator, expressed as a partial mapping between the state-action spaces of the different tasks. The Q-values learned on each source task are mapped to sets of Q-values for the target task. These transferred Q-values are merged and used to initialize the learning process of the target task, with an interval-based approach used to represent and merge the knowledge of the source tasks. Empirical results show that the transferred initialization can be beneficial to the learning process of the target task.
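
To make the mechanism concrete, the following is a minimal Python sketch of interval-based knowledge fusion for tabular Q-learning, assuming dictionary-based Q-tables. The function names (transfer_q_intervals, init_target_q), the partial mappings as dictionaries, and the midpoint initialization are illustrative assumptions, not the paper's exact algorithm.

    # Hypothetical sketch, not the paper's implementation: map each source
    # Q-table into the target state-action space through a partial mapping,
    # merge transferred values into [low, high] intervals, and initialize
    # the target Q-table from those intervals.
    from collections import defaultdict

    def transfer_q_intervals(source_q_tables, mappings):
        # source_q_tables: list of dicts {(s, a): q}, one per source task.
        # mappings: list of partial dicts {(s_src, a_src): (s_tgt, a_tgt)}.
        intervals = {}
        for q_table, mapping in zip(source_q_tables, mappings):
            for pair, q in q_table.items():
                target_pair = mapping.get(pair)
                if target_pair is None:
                    continue  # the inter-task mapping is only partial
                lo, hi = intervals.get(target_pair, (q, q))
                intervals[target_pair] = (min(lo, q), max(hi, q))
        return intervals

    def init_target_q(intervals, default=0.0):
        # Initialize the target Q-table from the merged intervals, here
        # using the interval midpoint; unmapped pairs fall back to default.
        q = defaultdict(lambda: default)
        for pair, (lo, hi) in intervals.items():
            q[pair] = 0.5 * (lo + hi)
        return q

After this initialization, ordinary Q-learning proceeds on the target task; the transferred values only shape the starting point of learning, so even an imperfect mapping cannot prevent eventual convergence.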

Highlights

  • Transfer learning is a challenging area in the field of reinforcement learning (RL) [1,2,3]

  • A dynamics transfer problem is a problem in which agents share the same context and the same reward function but have different transition models

  • Most current transfer learning approaches in RL are framed as leveraging knowledge learned on a source task to improve learning on a related, but different, target task

Introduction

Transfer learning is a challenging area in the field of reinforcement learning (RL) [1,2,3]. A goal transfer problem is a problem in which agents share the same context (i.e., state and action spaces) and the same transition model but have different reward functions. A dynamics transfer problem is a problem in which agents share the same context and the same reward function but have different transition models. In the case of domain transfer, the agents may have different dynamics, goals, and state-action spaces; this is the most general and complex problem of transfer. By contrast, when agents share the same dynamics and reward function but have different state-action spaces (contexts), the problem of knowledge transfer between such agents is called context transfer. This is formulated and discussed using the notion of Markov decision process (MDP) homomorphism [5, 6].
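
As background, the standard definition of an MDP homomorphism from the literature (following Ravindran and Barto's formulation) can be stated as below; the paper's precise conditions may differ in detail.

    % An MDP homomorphism h = (f, {g_s}) from M = (S, A, P, R) to
    % M' = (S', A', P', R') consists of surjections f : S -> S' and
    % g_s : A_s -> A'_{f(s)} such that, for all s, s' \in S and a \in A_s:
    P'\bigl(f(s),\, g_s(a),\, f(s')\bigr)
        = \sum_{s'' \in f^{-1}(f(s'))} P(s, a, s''),
    \qquad
    R'\bigl(f(s),\, g_s(a)\bigr) = R(s, a).

Intuitively, the homomorphism aggregates state-action pairs that behave identically, so each agent's MDP can be seen as a different "view" of one underlying common MDP.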

Context Transfer Problem
Why Context Transfer Is Important
Feature Space as a Translator between Tasks
Knowledge Fusion and Transfer
Q-Intervals for Knowledge Fusion
Case Studies and Results
Conclusion