Abstract
We present a deep neural-network controller trained by a model-free reinforcement learning (RL) algorithm to achieve hover stabilization for a quadrotor unmanned aerial vehicle (UAV). Two neural networks are trained: one serves as a stochastic controller that outputs a distribution over control inputs, while the other maps the UAV state to a scalar value estimate of the expected return under that controller. A proximal policy optimization (PPO) method, an actor–critic policy gradient approach, is used to train both networks. Simulation results show that the trained controller achieves performance comparable to that of a manually tuned proportional-derivative (PD) controller, despite using no model information. We also examine different choices of reward function and their influence on controller performance.
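At the core of PPO is a clipped surrogate objective that limits how far each policy update can move from the old policy. The function below is a minimal per-sample sketch of that objective, not the paper's implementation; the clipping parameter value is an assumption (0.2 is a common default):

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate loss (negated for minimization).

    ratio     -- pi_new(a|s) / pi_old(a|s), the probability ratio
    advantage -- advantage estimate supplied by the critic network
    eps       -- clipping range (assumed hyperparameter)
    """
    unclipped = ratio * advantage
    # Clamp the ratio to [1 - eps, 1 + eps] so large policy steps
    # receive no additional credit from the objective.
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps) * advantage
    # PPO maximizes min(unclipped, clipped); negate for gradient descent.
    return -min(unclipped, clipped)
```

In practice this loss is averaged over a batch of state–action samples and combined with a value-function loss for the critic, but the clipping logic itself is exactly this pointwise operation.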