A Supervised Reinforcement Learning Algorithm for Controlling Drone Hovering

Jiying Wu, Zhong Yang, Changliang Xu, Naifeng He, Zhiyong Wang, Luwei Liao, Chi Zhang, Haoze Zhuo

doi:10.3390/drones8030069

Abstract

Drones carrying various devices are increasingly used for aerial hovering operations, but little research has applied reinforcement learning (RL) to hover control, and such methods have not yet been implemented on physical machines. The drone's behavior space for hover control is continuous and large, which makes it difficult for basic algorithms and value-based RL algorithms to achieve good results. To address this, this article applies a watcher-actor-critic (WAC) algorithm to drone hover control; the algorithm quickly locks in the exploration direction and achieves highly robust hover control while improving learning efficiency and reducing learning cost. The article first applies the actor-critic algorithm based on the behavioral value Q (QAC) and the deep deterministic policy gradient (DDPG) algorithm to drone hover control learning. It then proposes an actor-critic algorithm with an added watcher, in which the watcher uses a PID controller, with parameters provided by a neural network, as a dynamic monitor, transforming the learning process into supervised learning. Finally, the article runs simulations using the classic reinforcement learning environment library Gym and the mainstream reinforcement learning framework PARL, and deploys the algorithm in a real-world environment, where a multi-sensor fusion strategy-based autonomous localization method for unmanned aerial vehicles is used for the practical experiments. The simulation and experimental results show that WAC requires 20% fewer training episodes than DDPG and 55% fewer than QAC, and that the proposed algorithm learns more efficiently, converges faster, and hovers more smoothly than QAC and DDPG.
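
As a rough illustration of the watcher idea described in the abstract, the sketch below shows a PID controller proposing a reference action and a loss that pulls the actor toward it. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `PIDWatcher` and `supervised_actor_loss`, the scalar altitude setting, the squared-error imitation term, and the `weight` parameter are all hypothetical, and the gains are passed in directly rather than produced by a neural network.

```python
from dataclasses import dataclass


@dataclass
class PIDWatcher:
    """Hypothetical watcher: a PID controller that proposes a reference action."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def set_gains(self, kp: float, ki: float, kd: float) -> None:
        # Assumption: in the abstract a neural network supplies these gains;
        # here the caller passes them in directly.
        self.kp, self.ki, self.kd = kp, ki, kd

    def act(self, target: float, measured: float, dt: float) -> float:
        # Standard discrete PID update on the tracking error.
        error = target - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def supervised_actor_loss(actor_action: float, watcher_action: float,
                          q_value: float, weight: float) -> float:
    # Assumed loss form (not taken from the paper): maximise the critic's
    # Q estimate while penalising deviation from the watcher's action.
    imitation = (actor_action - watcher_action) ** 2
    return -q_value + weight * imitation


# Usage: one control step toward a 1.0 m hover target from the ground.
watcher = PIDWatcher(kp=2.0, ki=0.5, kd=0.0)
reference = watcher.act(target=1.0, measured=0.0, dt=0.1)
loss = supervised_actor_loss(actor_action=1.0, watcher_action=reference,
                             q_value=3.0, weight=0.5)
```

With a larger `weight` the actor follows the PID reference more closely; with `weight` at zero the loss reduces to an ordinary critic-driven actor objective.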

Journal: Drones	Publication Date: Feb 20, 2024
Citations: 1	License type: CC BY 4.0

Similar Papers

UAV maneuvering decision-making algorithm based on Twin Delayed Deep Deterministic Policy Gradient Algorithm
Shuangxia Bai ... Evgeny Neretin
Journal of Artificial Intelligence and Technology | 07 Dec 2021

Deep Deterministic Policy Gradient-based Parameter Selection Method of Notch Filters for Suppressing Mechanical Resonance in Industrial Servo Systems
Tae-Ho Oh ... Sang-Oh Kim
01 Aug 2019

Stability Analysis for Autonomous Vehicle Navigation Trained over Deep Deterministic Policy Gradient
Mireya Cabezas-Olivenza ... Ekaitz Zulueta
Mathematics | VOL. 11 | 27 Dec 2022

A DDPG Algorithm Based Reinforcement Learning Controller for Three-Phase DC-AC Inverters
Jian Ye ... Xinan Zhang
24 Feb 2023
