Abstract

Deep reinforcement learning (DRL) has achieved notable success in game playing and robotic control. However, transfer learning, a standard technique for training deep neural networks in computer vision, cannot be applied directly to DRL, and many methods have been proposed to address this problem. In this paper, we propose a new method, based on the proximal policy optimization (PPO) algorithm, for learning similar tasks. The method uses a function to transform trajectories from a previous task into the current, similar task. The results show that, compared with a model trained by PPO alone, our method yields more stable performance in the early stage of training.
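The abstract does not specify the form of the trajectory-transformation function. As a purely illustrative sketch (the names `Transition`, `transfer_trajectory`, `state_map`, and `action_map` are all assumptions, not from the paper), one could imagine mapping transitions collected on the source task into the target task's state and action spaces before feeding them to PPO:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Transition:
    """One step of an agent's trajectory (hypothetical structure)."""
    state: List[float]
    action: int
    reward: float

def transfer_trajectory(
    trajectory: List[Transition],
    state_map: Callable[[List[float]], List[float]],
    action_map: Callable[[int], int],
) -> List[Transition]:
    """Map a trajectory from the previous (source) task into the
    current (target) task's state/action spaces, so the transformed
    experience can seed early PPO training on the new task."""
    return [
        Transition(state_map(t.state), action_map(t.action), t.reward)
        for t in trajectory
    ]

# Toy example: suppose the target task observes states scaled by 2
# and mirrors a discrete action set {0, 1}.
source = [Transition([0.5, -1.0], 0, 1.0), Transition([0.25, 0.0], 1, 0.5)]
mapped = transfer_trajectory(
    source,
    state_map=lambda s: [2 * x for x in s],
    action_map=lambda a: 1 - a,
)
```

The design choice here is that the transform is applied offline to stored transitions, leaving the PPO update itself untouched; how the paper's actual transformation is defined would depend on the specific pair of similar tasks.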
