Abstract

Deep Reinforcement Learning (DRL) is widely used in path planning because of its powerful neural network fitting and learning abilities. However, existing DRL-based methods use a discrete action space and do not consider the impact of historical state information, so the algorithm cannot learn the optimal strategy for planning a path, and the planned path contains arcs or too many corners, which does not meet the actual sailing requirements of a ship. In this paper, an optimized path planning method for coastal ships based on an improved Deep Deterministic Policy Gradient (DDPG) and the Douglas–Peucker (DP) algorithm is proposed. Firstly, Long Short-Term Memory (LSTM) is used to improve the network structure of DDPG; the network uses historical state information to approximate the current environmental state information, so that the predicted action is more accurate. In addition, the traditional reward function of DDPG may lead to low learning efficiency and slow convergence of the model. Hence, this paper improves the reward principle of traditional DDPG through a mainline reward function and an auxiliary reward function, which not only helps to plan a better path for the ship but also improves the convergence speed of the model. Secondly, because too many turning points in the resulting path may increase navigation risk, an improved DP algorithm is proposed to further optimize the planned path and make the final path safer and more economical. Finally, simulation experiments verify the proposed method in terms of path planning effect and convergence trend. Results show that the proposed method can plan safe and economical navigation paths and has good stability and convergence.
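The Douglas–Peucker step the abstract refers to can be sketched as follows. This is the classic DP simplification, not the paper's improved variant, and the sample waypoints and tolerance are illustrative only: waypoints closer than `epsilon` to the chord between retained endpoints are dropped, which removes redundant turning points from a planned path.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from waypoint p to the chord through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    # Distance from p to the infinite line through a and b.
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / math.hypot(dx, dy)

def douglas_peucker(points, epsilon):
    """Recursively drop waypoints closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    # Find the waypoint farthest from the chord between the endpoints.
    dmax, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > dmax:
            dmax, index = d, i
    if dmax <= epsilon:
        # All intermediate waypoints are within tolerance: keep endpoints only.
        return [points[0], points[-1]]
    # Otherwise split at the farthest waypoint and simplify both halves.
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right

path = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
print(douglas_peucker(path, 1.0))  # → [(0, 0), (2, -0.1), (3, 5), (7, 9)]
```

The tolerance `epsilon` trades path fidelity against the number of remaining turning points; the paper's improved DP additionally accounts for navigation safety when deciding which points to keep.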

Highlights

  • With the development of economic globalization, trade between countries is getting closer

  • Ships sailing in open waters are not restricted by coastlines, but dynamic obstacles such as icebergs will appear. There are proven obstacles and reefs in coastal waters, and no dynamic obstacles like icebergs, but shore-based information obstacles such as shipwreck areas, restricted navigation areas, and military exercise areas will appear. The path planning of coastal ships is mainly to avoid proven obstacles and shore-based information obstacles and to plan a safe and effective path for ships [6, 7]. The problem of avoiding other ships belongs to the field of collision avoidance, for which there are special rules [8], so this paper mainly considers avoiding proven obstacles and shore-based information obstacles

  • This paper proposes a ship path planning model based on Long Short-Term Memory (LSTM) and Deep Deterministic Policy Gradient (DDPG). The model uses historical state information to approximate the current environmental state information and constructs a mainline reward function and an auxiliary reward function to optimize the action selection strategy of DDPG, guiding the ship to avoid obstacles and reach the target point
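The idea of approximating the current environmental state from history can be illustrated with a minimal sketch. The paper uses an LSTM inside the DDPG networks; here a simple fixed-length sliding window of recent observations stands in for the recurrent network, and the class name, window size, and observation layout are all assumptions for illustration:

```python
from collections import deque

class HistoryStateWrapper:
    """Hypothetical sketch: feed the last `window` observations to the
    policy, so the state vector carries historical information rather
    than only the instantaneous observation."""

    def __init__(self, window=4, obs_dim=3):
        self.window = window
        self.obs_dim = obs_dim
        self.buffer = deque(maxlen=window)

    def reset(self, first_obs):
        self.buffer.clear()
        # Pad the window with the first observation at episode start.
        for _ in range(self.window):
            self.buffer.append(list(first_obs))
        return self.state()

    def step(self, obs):
        self.buffer.append(list(obs))
        return self.state()

    def state(self):
        # Flatten the window into one vector for the actor network.
        return [x for obs in self.buffer for x in obs]

wrapper = HistoryStateWrapper(window=3, obs_dim=2)
s0 = wrapper.reset((0.0, 0.0))
s1 = wrapper.step((1.0, 0.5))
print(len(s1), s1)  # 6 [0.0, 0.0, 0.0, 0.0, 1.0, 0.5]
```

An LSTM generalizes this window by learning what to retain from arbitrarily long histories instead of concatenating a fixed number of past observations.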

Summary

Introduction

With the development of economic globalization, trade between countries is getting closer. Most DRL-based path planning methods use algorithms based on a discrete action space, such as Q-learning and Deep Q-learning (DQN) [12,13,14,15]. Some scholars use the DDPG or A3C algorithm to establish path planning models in a continuous action space [16, 17]. (2) A ship path planning method based on the above-improved DDPG and reward function optimization is proposed. Aiming at the low data utilization and slow convergence speed of most DRL-based unmanned ship path planning methods, this paper optimizes the traditional reward function and designs mainline and auxiliary reward functions.
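The split into mainline and auxiliary rewards can be sketched as follows. This is a generic shaped-reward scheme in the spirit of the paper's description, not its exact formulation; the function name, weights, and penalty values are illustrative assumptions:

```python
import math

def shaped_reward(pos, goal, prev_pos, collided, reached,
                  goal_reward=100.0, collision_penalty=-100.0,
                  step_penalty=-0.1, progress_weight=1.0):
    """Hypothetical mainline + auxiliary reward sketch.

    Mainline reward: large terminal signals for reaching the goal or
    hitting an obstacle.  Auxiliary reward: a dense shaping term that
    pays the ship for reducing its distance to the target, so the agent
    gets feedback on every step instead of only at episode end.
    """
    if reached:
        return goal_reward          # mainline: goal reached
    if collided:
        return collision_penalty    # mainline: collision
    d_prev = math.dist(prev_pos, goal)
    d_now = math.dist(pos, goal)
    # Auxiliary: positive when the step moved the ship closer to the
    # goal, plus a small per-step cost to discourage long detours.
    return progress_weight * (d_prev - d_now) + step_penalty
```

Dense shaping of this kind is a common remedy for the sparse-reward problem the paper attributes to the traditional DDPG reward, since the agent no longer has to stumble onto the goal before receiving any informative signal.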

Related Research
Ship Path Planning Based on Improved DDPG
Method of this paper
Path Optimization Based on Improved DP Algorithm
Experimental Verification and Result Analysis
Conclusions
