Abstract

• A model-based RL method built on conditional generative adversarial networks (CGAN) is proposed.
• The CGAN-based model learning method can generate sufficient samples for policy learning.
• The CGAN-based model learning method does not require an explicit expression of the transition model.
• Performance is improved while sample efficiency is preserved.

Deep reinforcement learning (DRL) integrates the perception capability of deep learning and enables reinforcement learning to scale to problems with high-dimensional state and action spaces that were previously intractable. The success of DRL relies primarily on the high-level representation ability of deep learning. Obtaining a well-performing representation model, however, requires a large number of training samples and considerable training time, and collecting many samples in the real world is extremely expensive and time consuming. To mitigate this sample inefficiency, we propose a novel model-based reinforcement learning method that combines conditional generative adversarial networks with a state-of-the-art policy learning method (CGAN-MbRL). The proposed CGAN-MbRL can directly handle high-dimensional states and alleviates the sample inefficiency problem to some extent. Finally, the effectiveness of the proposed method is demonstrated on illustrative data and an RL benchmark.
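The abstract gives no implementation details, so the sketch below is only a rough illustration of the general idea of using a conditional GAN as a learned transition model: a generator produces next states conditioned on (state, action) plus noise, and a discriminator distinguishes real transitions from generated ones, so that the learned model can synthesize additional transitions for policy learning. The PyTorch framing, network sizes, and all dimensions are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions chosen only for illustration; not taken from the paper.
STATE_DIM, ACTION_DIM, NOISE_DIM = 8, 2, 16


class Generator(nn.Module):
    """Generates a next state conditioned on (state, action) and a noise vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM + NOISE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, STATE_DIM),
        )

    def forward(self, state, action, noise):
        return self.net(torch.cat([state, action, noise], dim=-1))


class Discriminator(nn.Module):
    """Scores how plausible a (state, action, next_state) transition is."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),  # raw logit, paired with BCEWithLogitsLoss
        )

    def forward(self, state, action, next_state):
        return self.net(torch.cat([state, action, next_state], dim=-1))


def train_step(gen, disc, gen_opt, disc_opt, state, action, next_state):
    """One adversarial update on a batch of real transitions (s, a, s')."""
    bce = nn.BCEWithLogitsLoss()
    batch = state.shape[0]
    noise = torch.randn(batch, NOISE_DIM)

    # Discriminator: push real transitions toward label 1, generated ones toward 0.
    fake_next = gen(state, action, noise).detach()
    d_loss = bce(disc(state, action, next_state), torch.ones(batch, 1)) + \
             bce(disc(state, action, fake_next), torch.zeros(batch, 1))
    disc_opt.zero_grad()
    d_loss.backward()
    disc_opt.step()

    # Generator: try to make generated transitions be labelled as real.
    fake_next = gen(state, action, noise)
    g_loss = bce(disc(state, action, fake_next), torch.ones(batch, 1))
    gen_opt.zero_grad()
    g_loss.backward()
    gen_opt.step()
    return d_loss.item(), g_loss.item()
```

Once trained on real transitions, a model of this kind could be sampled (by drawing fresh noise for a given state-action pair) to produce synthetic rollouts for the policy learner, which is the mechanism the highlights refer to when they say the model "can generate sufficient samples for policy learning".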
