HJB-RL: Initializing Reinforcement Learning with Optimal Control Policies Applied to Autonomous Drone Racing

Keiko Nagami, Mac Schwager

doi:10.15607/rss.2021.xvii.062

Abstract

In this work we present a planning and control method for a quadrotor in an autonomous drone race. Our method combines the advantages of model-based optimal control and model-free deep reinforcement learning. We consider a single drone racing on a track marked by a series of gates, through which it must maneuver in minimum time. First, we solve the discretized Hamilton-Jacobi-Bellman (HJB) equation to produce a closed-loop policy for a simplified, reduced-order model of the drone. Next, we train a deep network policy in a supervised fashion to mimic the HJB policy. Finally, we further train this network using policy gradient reinforcement learning on the full drone dynamics model with a low-level feedback controller in the loop. This yields a deep network policy that controls the drone to pass through a single gate. On a race course, this policy is applied successively to each oncoming gate to guide the drone through the course. The resulting policy completes a high-fidelity AirSim drone race with 12 gates in 34.89 s on average, outracing a model-based HJB policy by 33.20 s, a supervised learning policy by 1.24 s, and a trajectory planning policy by 12.99 s, while a model-free RL policy was never able to complete the race.
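To make the first stage concrete, the following is a minimal sketch of solving a discretized minimum-time HJB (Bellman) equation by value iteration. It is not the authors' implementation: the abstract does not specify the reduced-order drone model, so the sketch assumes a hypothetical point that moves at constant speed between neighboring cells of a 2D grid, with a single gate cell as the target. The function names `solve_min_time_hjb` and `greedy_move`, and all parameters, are illustrative.

```python
import numpy as np

# Eight-connected moves on the grid (an assumed, simplified control set).
MOVES = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]


def solve_min_time_hjb(n=11, h=1.0, speed=1.0, gate=(5, 5), tol=1e-12, max_iters=1000):
    """Value iteration for V(x) = min_u [dt(u) + V(x + u)], with V(gate) = 0.

    V[i, j] is the minimum time to reach the gate cell from cell (i, j).
    """
    V = np.full((n, n), np.inf)
    V[gate] = 0.0
    for _ in range(max_iters):
        V_new = V.copy()
        for i in range(n):
            for j in range(n):
                if (i, j) == gate:
                    continue
                for di, dj in MOVES:
                    ni, nj = i + di, j + dj
                    if 0 <= ni < n and 0 <= nj < n:
                        dt = h * np.hypot(di, dj) / speed  # travel time of this move
                        V_new[i, j] = min(V_new[i, j], dt + V[ni, nj])
        # Stop once a full sweep no longer changes the value function.
        if np.allclose(V_new, V, rtol=0.0, atol=tol):
            return V_new
        V = V_new
    return V


def greedy_move(V, state, h=1.0, speed=1.0):
    """Closed-loop policy: pick the move minimizing travel time plus cost-to-go."""
    i, j = state
    n_rows, n_cols = V.shape
    best_move, best_cost = None, np.inf
    for di, dj in MOVES:
        ni, nj = i + di, j + dj
        if 0 <= ni < n_rows and 0 <= nj < n_cols:
            cost = h * np.hypot(di, dj) / speed + V[ni, nj]
            if cost < best_cost:
                best_move, best_cost = (di, dj), cost
    return best_move
```

Under these assumptions, the state-action pairs produced by `greedy_move` play the role of the HJB policy that the second stage imitates with supervised learning, before policy gradient training refines the network on the full dynamics.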


