<strong>Efficient RL Algorithm by Combing AC with Dual Piecewise Model Learning</strong>

Quan Liu, Shan Zhong, Qiming Fu

doi:10.3390/mol2net-02-03895

Open Access

Abstract

As classic methods for handling continuous action space problems in RL, the actor-critic (AC) algorithm and its variants are still not sample efficient. Therefore, we propose a method that learns two linear models for planning. The two models are a state-based piecewise model and an action-based piecewise model, which are determined by divisions of the state space and the action space, respectively. Through division, the models are learned more accurately. To accelerate convergence, samples near the goal are saved and used to learn the model, the value function, and the policy, which balances the distribution of the samples. On two classic RL benchmarks with continuous MDPs, the proposed method shows the ability to learn an optimal policy by combining both models, and it also outperforms the representative methods in terms of convergence rate and sample efficiency.
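
The idea of learning one linear model per region of a partition can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class `PiecewiseLinearModel`, the one-dimensional toy dynamics in `true_dynamics`, the uniform two-interval partitions, the normalized LMS update, and the averaging rule in `combined_prediction` (the abstract does not say how the two models are combined).

```python
import random


class PiecewiseLinearModel:
    """One linear next-state model per interval of a 1-D partition (illustrative)."""

    def __init__(self, low, high, n_pieces, lr=0.5):
        self.low, self.high, self.n_pieces, self.lr = low, high, n_pieces, lr
        # One weight vector [w_state, w_action, bias] per piece.
        self.weights = [[0.0, 0.0, 0.0] for _ in range(n_pieces)]

    def piece(self, key):
        """Index of the interval containing `key`, clipped to the valid range."""
        frac = (key - self.low) / (self.high - self.low)
        return min(self.n_pieces - 1, max(0, int(frac * self.n_pieces)))

    def predict(self, key, state, action):
        w = self.weights[self.piece(key)]
        return w[0] * state + w[1] * action + w[2]

    def update(self, key, state, action, next_state):
        """Normalized LMS step on the piece selected by `key`."""
        x = [state, action, 1.0]
        w = self.weights[self.piece(key)]
        error = next_state - sum(wi * xi for wi, xi in zip(w, x))
        norm = sum(xi * xi for xi in x)
        for i in range(3):
            w[i] += self.lr * error * x[i] / norm


def true_dynamics(s, a):
    # Hypothetical toy dynamics, piecewise linear in the state.
    return 0.9 * s + 0.5 * a if s < 0 else 0.5 * s + 0.5 * a + 0.1


def train(n_samples=20000, seed=0):
    rng = random.Random(seed)
    # The state-based model picks its piece from the state,
    # the action-based model picks its piece from the action.
    state_model = PiecewiseLinearModel(-1.0, 1.0, n_pieces=2)
    action_model = PiecewiseLinearModel(-1.0, 1.0, n_pieces=2)
    for _ in range(n_samples):
        s, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
        s_next = true_dynamics(s, a)
        state_model.update(s, s, a, s_next)
        action_model.update(a, s, a, s_next)
    return state_model, action_model


def combined_prediction(state_model, action_model, s, a):
    # Assumption: plain average of the two models' predictions.
    return 0.5 * (state_model.predict(s, s, a) + action_model.predict(a, s, a))
```

In this toy setting the state partition happens to line up with the breakpoint of the dynamics, so the state-based model can fit each region with its own linear map, whereas a single global linear model could not. A planner could then query `combined_prediction` to generate simulated transitions, though how the paper actually uses the two models for planning is not described in the abstract.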

Full Text