Abstract

This study proposed a reinforcement Q-learning-based deep neural network (RQDNN) that combined a deep principal component analysis network (DPCANet) with Q-learning to determine a playing strategy for video games. Video game images were used as the inputs. The proposed DPCANet initialized the parameters of the convolution kernels and captured image features automatically. It performs as a deep neural network with lower computational complexity than traditional convolutional neural networks. A reinforcement Q-learning method was then used to implement the strategy for playing the video game. Both the Flappy Bird and Atari Breakout games were implemented to verify the proposed method. Experimental results showed that the scores achieved by the proposed RQDNN were better than those of human players and other methods, and that its training time was also far shorter than that of other methods.
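To make the DPCANet idea concrete, the following is a minimal sketch of deriving convolution kernels from the principal components of image patches instead of learning them by backpropagation, in the spirit of PCANet-style networks. It is not the paper's implementation; the function name pca_conv_filters, the patch size, and the filter count are illustrative assumptions.

```python
# Minimal sketch, assuming grayscale game frames and a single PCA stage.
import numpy as np

def pca_conv_filters(images, patch_size=5, num_filters=8):
    """Derive convolution kernels as the leading principal components
    of mean-removed image patches (no backpropagation involved)."""
    k = patch_size
    patches = []
    for img in images:
        h, w = img.shape
        for i in range(0, h - k + 1, k):
            for j in range(0, w - k + 1, k):
                p = img[i:i + k, j:j + k].astype(np.float64).ravel()
                patches.append(p - p.mean())   # remove the patch mean
    X = np.stack(patches)                       # (num_patches, k*k)
    # Eigenvectors of the patch covariance matrix, largest eigenvalue first.
    _, vecs = np.linalg.eigh(X.T @ X)
    top = vecs[:, ::-1][:, :num_filters]        # leading components
    return top.T.reshape(num_filters, k, k)     # one kernel per component

# Example: learn 8 kernels of size 5x5 from random stand-in "frames".
frames = [np.random.rand(80, 80) for _ in range(16)]
filters = pca_conv_filters(frames)
print(filters.shape)  # (8, 5, 5)
```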

Highlights

  • Reinforcement learning was first used to play video games at the Mario AI Competition, hosted in 2009 by the Institute of Electrical and Electronics Engineers (IEEE) Games Innovation Conference and the IEEE Symposium on Computational Intelligence and Games [1]

  • In 2013, Mnih et al. [2] proposed a convolutional neural network based on the deep reinforcement learning algorithm, called the Deep Q-Network (DQN)

  • Convolutional neural networks are divided into convolutional layers and pooling layers; convolution kernels gained from learning are significantly better than randomly generated ones (a pooling example is sketched below)

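As a concrete illustration of the pooling layer mentioned in the highlights (and in the section outline below), here is a hedged sketch of 2x2 max pooling over a feature map; the function name and block size are illustrative assumptions, not taken from the paper.

```python
# Hedged illustration of a pooling step: 2x2 max pooling downsamples a
# feature map by keeping the maximum of each non-overlapping 2x2 block.
import numpy as np

def max_pool2x2(fmap):
    """Downsample a 2-D feature map by taking the max of each 2x2 block."""
    h, w = fmap.shape
    trimmed = fmap[:h - h % 2, :w - w % 2]      # drop odd edge rows/columns
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2x2(fmap))  # [[ 5.  7.]
                          #  [13. 15.]]
```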

Summary

Introduction

Mnih et al. [3] proposed an improved DQN by adding a replay memory mechanism: all learned states are stored, and in each update a number of empirical values are randomly selected from the experience data. To evaluate the effectiveness of their method, a virtual maze was created using the AirSim software. Reinforcement learning has also been applied to realize lateral control for autonomous driving in an open racing car platform, and Yu et al. [9] designed a controller for a quadrotor that learned autonomously and obtained results similar to human performance. These studies show that training data are easier and less costly to obtain with reinforcement learning. In this study, a reinforcement Q-learning-based deep neural network (RQDNN) was proposed to improve the above-mentioned shortcomings.
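The replay memory mechanism can be sketched as follows: transitions are stored in a bounded memory, and each update draws a random minibatch from it. This is a hedged, tabular illustration rather than the paper's implementation (the RQDNN estimates Q-values from DPCANet image features); all names and hyperparameter values here are assumptions.

```python
import random
from collections import defaultdict, deque

ALPHA, GAMMA = 0.1, 0.99        # assumed learning rate and discount factor
memory = deque(maxlen=10_000)   # replay memory of past transitions
Q = defaultdict(float)          # Q[(state, action)] -> estimated value

def store(state, action, reward, next_state, actions):
    """Record one transition; 'actions' lists the moves available in next_state."""
    memory.append((state, action, reward, next_state, actions))

def replay_update(batch_size=32):
    """One Q-learning sweep over a random minibatch drawn from memory."""
    batch = random.sample(list(memory), min(batch_size, len(memory)))
    for s, a, r, s2, acts in batch:
        target = r + GAMMA * max(Q[(s2, a2)] for a2 in acts)  # Bellman target
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])             # TD update

# Illustrative use with a Flappy Bird-like action set.
store("s0", "flap", 0.1, "s1", actions=("flap", "idle"))
replay_update()
```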

Overview of Q-Learning
Pooling Layer
The Proposed RQDNN
Evaluation of Different Convolution Layers in DPCANet
The Flappy Bird Game
Figure 15. Comparison results of RQDNN, showing that its training time was shorter than that of the compared methods.
The Atari Breakout Game
Conclusions
