Abstract

Reinforcement learning algorithms usually require a large number of samples and converge slowly in practical applications. One solution is to introduce transfer learning: knowledge from well-learned source tasks can be reused to reduce the sample requirement and accelerate learning on target tasks. However, if a mismatched source task is selected, it can slow down or even disrupt the learning procedure. It is therefore essential for knowledge transfer to select source tasks that closely match the target tasks. In this paper, a novel task matching algorithm is proposed that derives the latent structures of the tasks' value functions and aligns these structures to estimate similarity. Through latent structure matching, highly matched source tasks are selected effectively; knowledge is then transferred from them to provide action advice and improve the exploration strategy of the target tasks. Experiments are conducted on a simulated navigation environment and the mountain car environment. The results show a significant performance gain of the improved exploration strategy over the traditional ϵ-greedy exploration strategy. A theoretical proof is also given to verify the improvement of the exploration strategy based on latent structure matching.
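To make the matching step concrete, the following is a minimal Python sketch of how latent structures could be extracted from tabular value functions and compared to rank candidate source tasks. It assumes value functions are stored as state-action matrices, uses a truncated SVD as the low-rank embedding, and scores alignment via the principal angles between the embedded subspaces; the function names and the choice of subspace similarity are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def latent_structure(q_table, rank=5):
    """Low-rank latent structure of a tabular value function.
    q_table: |S| x |A| array of state-action values."""
    u, _, _ = np.linalg.svd(q_table, full_matrices=False)
    return u[:, :rank]                     # top-`rank` left singular vectors

def structure_similarity(struct_a, struct_b):
    """Align two latent structures and score their similarity using the
    cosines of the principal angles between the spanned subspaces."""
    sigma = np.linalg.svd(struct_a.T @ struct_b, compute_uv=False)
    return float(np.mean(sigma))           # 1.0 means identical subspaces

def select_source_task(target_q, source_qs, rank=5):
    """Return the index of the source task whose latent structure
    best matches the target task's, along with all similarity scores."""
    target_struct = latent_structure(target_q, rank)
    scores = [structure_similarity(target_struct, latent_structure(q, rank))
              for q in source_qs]
    return int(np.argmax(scores)), scores
```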

Highlights

  • Reinforcement learning (RL) is a paradigm in which an agent guides its actions based on the rewards obtained from trial-and-error interaction with the environment [1,2]

  • In RL, the knowledge obtained from previous situations can be reused as heuristics to achieve effective knowledge transfer, speeding up learning in new situations and reducing the sample requirement [3]; knowledge transfer can thus largely mitigate the issues caused by a change in the problem configuration

  • Based on latent structure matching (LSM), we present an improved exploration strategy that is built on the knowledge obtained from the highly matched source task


Summary

Introduction

Reinforcement learning (RL) is a paradigm in which an agent guides its actions based on the rewards obtained from trial-and-error interaction with the environment [1,2]. Similarity estimation between tasks is the main way to select matched source tasks in existing work on knowledge transfer for RL. Some works used clustering algorithms [22] to handle large numbers of tasks; in these works, the policies, value functions, rewards, and dynamics of tasks were modeled as random processes to estimate similarity [23,24]. Based on LSM, we present an improved exploration strategy that is built on the knowledge obtained from the highly matched source task. This improved strategy reduces random exploration in the value function space of tasks, effectively improving the performance of RL agents.
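As a rough illustration of such an advice-based exploration rule, the sketch below modifies ϵ-greedy action selection so that exploratory steps are biased toward the action suggested by the matched source task's value function rather than being chosen uniformly at random. The tabular value functions, the `advice_prob` parameter, and the function name are hypothetical assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsm_guided_action(target_q, source_q, state, epsilon=0.1, advice_prob=0.8):
    """Epsilon-greedy action selection with source-task advice.

    target_q, source_q: |S| x |A| value tables for the target task and the
    matched source task; `advice_prob` is the chance that an exploratory
    step follows the source task's advice instead of acting randomly.
    """
    n_actions = target_q.shape[1]
    if rng.random() < epsilon:                       # exploration step
        if rng.random() < advice_prob:               # follow source-task advice
            return int(np.argmax(source_q[state]))
        return int(rng.integers(n_actions))          # plain random exploration
    return int(np.argmax(target_q[state]))           # greedy exploitation
```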

Knowledge Transfer in RL
Low Rank Embedding
Method
Value Function Transfer
Experiments
Experiments on Maze Navigation Problem
[Figure: LSM-based exploration vs. ϵ-greedy exploration]
Experiments on Mountain Car Problem
Findings
Conclusions