Improve exploration in deep reinforcement learning for UAV path planning using state and action entropy

Hui Lv, Yadong Chen, Shibo Li, Baolong Zhu, Min Li

doi:10.1088/1361-6501/ad2663

Abstract

Although deep reinforcement learning is a widely adopted development framework for unmanned aerial vehicles (UAVs), it is often considered sample inefficient. In particular, a UAV struggles to fully explore the state and action spaces in environments with sparse rewards. Some exploration algorithms have been proposed to overcome the challenge of sparse rewards, but they are not specifically tailored to UAV platforms. Consequently, applying those algorithms to UAV path planning may lead to problems such as unstable training and insufficient exploration of the action space, which can degrade the path planning results. To address the problem of sparse rewards in UAV path planning, we propose an information-theoretic exploration algorithm, Entropy Explorer (EE), designed specifically for UAV platforms. EE generates intrinsic rewards based on state entropy and action entropy to compensate for the scarcity of extrinsic rewards. To further improve sampling efficiency, we propose a framework that integrates EE with the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. Finally, the resulting TD3-EE algorithm is tested in AirSim and compared against benchmark algorithms. The simulation results show that TD3-EE effectively stimulates the UAV to explore both the state and action spaces comprehensively, and that it outperforms the benchmark algorithms in path planning.
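
The abstract does not give the formulas EE uses, so the following is only a minimal sketch of the general idea of adding state-entropy and action-entropy bonuses to a sparse extrinsic reward. Everything in it is an assumption made for illustration and not the authors' method: the function names, the k-nearest-neighbour distance used as a stand-in for state entropy, the histogram estimate of action entropy, and the weights `beta_state` and `beta_action` are all hypothetical.

```python
import math
from collections import Counter


def knn_state_bonus(state, visited_states, k=3):
    """Novelty proxy for state entropy (assumed form, not from the paper):
    log(1 + distance to the k-th nearest previously visited state)."""
    if not visited_states:
        return 0.0
    dists = sorted(math.dist(state, s) for s in visited_states)
    kth = dists[min(k, len(dists)) - 1]  # fall back to the farthest if fewer than k
    return math.log(1.0 + kth)


def action_entropy(actions, bins=4, low=-1.0, high=1.0):
    """Shannon entropy (in nats) of a histogram over scalar actions,
    with actions clamped into [low, high] before binning."""
    if not actions:
        return 0.0
    width = (high - low) / bins
    counts = Counter(
        max(0, min(int((a - low) / width), bins - 1)) for a in actions
    )
    n = len(actions)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def shaped_reward(extrinsic, state, visited_states, recent_actions,
                  beta_state=0.1, beta_action=0.05):
    """Extrinsic reward plus weighted intrinsic bonuses (hypothetical weights)."""
    return (extrinsic
            + beta_state * knn_state_bonus(state, visited_states)
            + beta_action * action_entropy(recent_actions))
```

In a TD3-style training loop, a shaped reward of this kind would be stored in the replay buffer in place of the raw extrinsic reward, so that rarely visited states and a more varied use of the action range are rewarded even when the environment itself returns nothing.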

Journal: Measurement Science and Technology
Publication Date: Feb 22, 2024
Citations: 3
License type: iop-standard
