Abstract

The popular deep Q-learning algorithm is known to be unstable because of oscillating Q-values and the overestimation of action values under certain conditions. These issues tend to adversely affect its performance. In this paper, we develop an ensemble network architecture for deep reinforcement learning based on value function approximation. The temporal ensemble stabilizes the training process by reducing the variance of the target approximation error, and the ensemble of target values reduces overestimation and improves performance by estimating Q-values more accurately. Our results show that this architecture leads to statistically significantly better value evaluation and to more stable and better performance on several classical control tasks in the OpenAI Gym environment.


Summary

Introduction

Reinforcement learning (RL) algorithms [1, 2] are well suited to learning to control an agent by letting it interact with an environment. Deep neural networks (DNN) have been introduced into reinforcement learning, where they have achieved great success in value function approximation. The first deep Q-network (DQN) algorithm, which successfully combines DNNs, a powerful nonlinear function approximation technique, with the Q-learning algorithm, was proposed by Mnih et al. [3]. Following the DQN work, a variety of solutions have been proposed to stabilize the algorithm [3,4,5,6,7,8,9]. The deep Q-network class of algorithms has achieved unprecedented success in challenging domains such as Atari 2600 and some other games.
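
The overestimation issue mentioned in the abstract can be illustrated with a small numerical sketch. It is not the architecture developed in the paper; it only shows the general effect that taking a maximum over noisy value estimates is biased upward, and that averaging several independent estimates before taking the maximum shrinks that bias. The Gaussian noise model, the four equal true action values, the ensemble size `k`, and all function names are assumptions made for illustration.

```python
import random
import statistics


def max_of_noisy_estimates(true_q, noise_std, rng):
    """Max over a single noisy estimate per action (plain Q-learning style target)."""
    return max(q + rng.gauss(0.0, noise_std) for q in true_q)


def max_of_averaged_estimates(true_q, noise_std, k, rng):
    """Average k independent noisy estimates per action, then take the max."""
    averaged = [
        statistics.fmean(q + rng.gauss(0.0, noise_std) for _ in range(k))
        for q in true_q
    ]
    return max(averaged)


def mean_overestimation(estimator, true_q, trials, **kwargs):
    """Average gap between the estimated max and the true max over many trials."""
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    best = max(true_q)
    return statistics.fmean(
        estimator(true_q, rng=rng, **kwargs) - best for _ in range(trials)
    )


# Hypothetical setting: four actions that are all equally good (true value 0).
TRUE_Q = [0.0, 0.0, 0.0, 0.0]

single_bias = mean_overestimation(
    max_of_noisy_estimates, TRUE_Q, 5000, noise_std=1.0
)
ensemble_bias = mean_overestimation(
    max_of_averaged_estimates, TRUE_Q, 5000, noise_std=1.0, k=5
)

print(f"bias with one estimate per action:      {single_bias:.3f}")
print(f"bias with 5 averaged estimates per action: {ensemble_bias:.3f}")
```

Under these assumptions both biases are positive, but the averaged variant is noticeably smaller, because averaging `k` independent estimates divides the noise variance by `k` before the maximum is taken.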
