Abstract

In this paper, we study UAV ground target tracking in obstacle environments using deep reinforcement learning, and present an improved deep deterministic policy gradient (DDPG) algorithm. A reward function based on line of sight and the artificial potential field is constructed to guide the UAV toward target tracking, and a penalty term on the action keeps the trajectory smooth. To improve exploration, multiple UAVs, controlled by the same policy network, perform the task in each episode. Since historical observations are strongly correlated with the policy, long short-term memory networks are used to approximate the state of the environment, which improves the approximation accuracy and the efficiency of data utilization. Simulation results show that the proposed method enables the UAV to track the target and avoid obstacles effectively.
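As a concrete illustration of the reward design described above, the following is a minimal Python sketch, not the authors' exact formulation: a line-of-sight tracking term, an artificial-potential-field repulsion term, and an action-smoothness penalty. The weights `w_track`, `w_apf`, `w_act` and the safety distance `d_safe` are hypothetical placeholders.

```python
import numpy as np

def reward(uav_pos, uav_vel, target_pos, obstacles, action, prev_action,
           w_track=1.0, w_apf=0.5, w_act=0.1, d_safe=5.0):
    """Hypothetical reward: line-of-sight tracking + APF repulsion + action penalty."""
    # Line-of-sight term: reward velocity aligned with the UAV-to-target direction.
    los = target_pos - uav_pos
    los_dir = los / (np.linalg.norm(los) + 1e-8)
    r_track = w_track * float(np.dot(uav_vel, los_dir))

    # Artificial-potential-field term: repulsive penalty near each obstacle,
    # growing quadratically as the UAV enters the safety margin.
    r_apf = 0.0
    for obs_pos, obs_radius in obstacles:
        d = np.linalg.norm(uav_pos - obs_pos) - obs_radius
        if d < d_safe:
            r_apf -= w_apf * (1.0 / max(d, 1e-3) - 1.0 / d_safe) ** 2

    # Action-smoothness penalty: discourage abrupt control changes.
    r_act = -w_act * float(np.linalg.norm(action - prev_action) ** 2)

    return r_track + r_apf + r_act
```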

Highlights

  • Unmanned aerial vehicles (UAVs) have the advantages of safety, low cost and high maneuverability

  • The improved deep deterministic policy gradient (DDPG) algorithm is trained in a virtual simulation environment, and the well-trained algorithm can be used for online target tracking and obstacle avoidance in new dynamic environments

  • The BACKGROUND section introduces the background knowledge of DRL-based ground target tracking and obstacle avoidance, including the DDPG algorithm and the environment used for tracking


Summary

INTRODUCTION

Unmanned aerial vehicles (UAVs) have the advantages of safety, low cost, and high maneuverability. Highly autonomous online trajectory planning of UAVs for target tracking and obstacle avoidance in unknown working environments has attracted great attention [2]–[4]. Deep deterministic policy gradient (DDPG) [27] is a DRL algorithm that combines DQN with actor-critic and operates in continuous action spaces. Dynamic and partially observable environments are major challenges for UAV target tracking [26]. To overcome these difficulties, we improve DDPG in terms of the reward function and data usage. The improved DDPG algorithm is trained in a virtual simulation environment, and the well-trained algorithm can be used for online target tracking and obstacle avoidance in new dynamic environments. The background section covers the DDPG algorithm, the ground target tracking environment, the kinetic model of the UAV, and the observation and action spaces.
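For orientation, here is a minimal PyTorch sketch of an LSTM-based deterministic actor in the spirit of the approach described above: the LSTM summarizes the history of partial observations into a state estimate from which a continuous action is produced. The layer sizes, the two-layer head, and the use of tanh to bound actions are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LSTMActor(nn.Module):
    """Hypothetical DDPG actor: an LSTM approximates the environment state
    from the observation history; a small MLP head outputs the action."""

    def __init__(self, obs_dim, act_dim, hidden_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, act_dim), nn.Tanh(),  # bounded continuous action
        )

    def forward(self, obs_seq):
        # obs_seq: (batch, time, obs_dim) -- the history of partial observations.
        _, (h_n, _) = self.lstm(obs_seq)
        return self.head(h_n[-1])  # act from the final hidden state

# Usage sketch: a 10-step history of 12-dim observations, 2-dim action.
actor = LSTMActor(obs_dim=12, act_dim=2)
action = actor(torch.randn(1, 10, 12))
```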

BACKGROUND
ENVIRONMENTS
OBSERVATION AND ACTION SPACE
REWARD FUNCTION
EXPERIMENTS
EXPERIMENT RESULT
Findings
CONCLUSION AND PROSPECT