Abstract

Unmanned Surface Vehicles (USVs) have broad application prospects, and autonomous path planning, as a crucial enabling technology, has become an active research direction in the USV field. This paper proposes an Improved Dueling Double Deep Q-Network Based on Prioritized Experience Replay (IPD3QN) to address the slow and unstable convergence of the traditional Deep Q-Network (DQN) algorithm in autonomous path planning of USVs. First, a double deep Q-network decouples the selection of the action for the target Q-value from its evaluation, which mitigates overestimation. Prioritized experience replay is adopted to draw samples from the replay buffer, which raises the utilization rate of valuable samples and accelerates neural network training. Then, the neural network is further optimized by introducing a dueling network structure. Finally, a soft update of the target network improves the stability of the algorithm, and a dynamic ϵ-greedy method is used to search for the optimal policy. The algorithm is first pre-validated on the OpenAI Gym platform using two classical control problems, CartPole and MountainCar, and the impact of the hyperparameters on model performance is analyzed in detail. The algorithm is then validated in a maze environment. Comparative simulation experiments show that IPD3QN significantly improves learning performance in terms of convergence speed and convergence stability compared with DQN, D3QN, PD2QN, PDQN, and PD3QN. Moreover, with the IPD3QN algorithm the USV can plan the optimal path according to the actual navigation environment.
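
As a concrete illustration of the mechanisms summarized above, the short NumPy sketch below shows how the double-Q target, the soft target-network update, and the dynamic ϵ-greedy schedule fit together. It is not the authors' implementation: it omits the dueling head and prioritized replay, replaces the neural networks with small Q-tables, and all names (q_online, q_target, tau, eps_*) are illustrative assumptions.

import numpy as np

n_states, n_actions, gamma, tau = 5, 3, 0.99, 0.01
rng = np.random.default_rng(0)

# Stand-ins for the online and target Q-networks (here: tables of Q-values).
q_online = rng.normal(size=(n_states, n_actions))
q_target = q_online.copy()

def double_q_target(reward, next_state, done):
    """Double-DQN target: the online net selects the action,
    the target net evaluates it, which reduces overestimation."""
    best_action = int(np.argmax(q_online[next_state]))
    bootstrap = 0.0 if done else gamma * q_target[next_state, best_action]
    return reward + bootstrap

def soft_update():
    """Soft (Polyak) target update:
    theta_target <- tau * theta_online + (1 - tau) * theta_target."""
    global q_target
    q_target = tau * q_online + (1.0 - tau) * q_target

def epsilon(step, eps_start=1.0, eps_end=0.05, decay=0.995):
    """Dynamic epsilon-greedy: exploration decays as training progresses."""
    return max(eps_end, eps_start * decay ** step)

# Example: one TD target, one soft update, and the exploration rate at step 100.
print(double_q_target(reward=1.0, next_state=2, done=False))
soft_update()
print(epsilon(100))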

Highlights

  • To address poor stability and slow convergence of the Deep Q Network (DQN) algorithm in path planning problems, this paper proposes an Improved Dueling Double Deep Q-Network Based on Prioritized Experience Replay (IPD3QN)

  • Compared with the baseline algorithms, the proposed method enables the Unmanned Surface Vehicle (USV) to plan the optimal path faster according to the actual navigation environment

  • IPD3QN converges faster than the baselines; the data in Table 7 show that in the maze environment its average reward exceeds that of the other algorithms, and its standard deviation is reduced by 59.6%, 53.1%, 54.3%, 61.1%, and 46.2% relative to DQN, D3QN, PD2QN, PDQN, and PD3QN, respectively, indicating more stable performance


Summary

Introduction

As the global population and economy continue to grow and the energy available on land becomes harder to exploit, countries around the world are turning their attention to the oceans, which cover approximately two-thirds of the planet [1]. Reinforcement learning does not require prior knowledge of complex environment models, which helps it approach human-level intelligence and makes it an attractive approach for path planning [2,3], autonomous driving [4], video games [5], robot control [6], and USV path planning [7]. Although traditional Q-learning algorithms [8] achieve good results in path planning, they still converge slowly and cannot solve large-scale, highly complex real-world problems [9]. To address the poor stability and slow convergence of the DQN algorithm in path planning problems, this paper proposes an Improved Dueling Double Deep Q-Network Based on Prioritized Experience Replay (IPD3QN). Compared with the baseline algorithms, the proposed method enables the USV to plan the optimal path faster according to the actual navigation environment.

Reinforcement Learning
Deep Q-Networks
Double Deep Q-Networks
Dueling Deep Q-Networks
Prioritized Experience Replay Deep Q-Networks
Convergence Rate and Convergence Stability
Soft Update of the Target Network
Dynamic ε-Greedy Index Decline Method
Algorithm Description
Environment Description
Result Analysis
Findings
Conclusions and Future Work