Abstract

This paper presents a proximal policy optimization (PPO)-based model reference tracking controller design for quadrotor unmanned aerial vehicles (UAVs). First, the quadrotor UAV is divided into attitude and position subsystems, each expressed as a nonlinear state-space equation. We then design a linear reference model, from which the tracking error dynamics are derived. The proposed neural network model implements a PPO-based reinforcement learning algorithm consisting of an actor that approximates the policy and a critic that approximates the state-value function. The actor receives the state variables of the tracking error dynamics as input and outputs the thrust values to be applied along each axis. We further decentralize the PPO-based controller into attitude and position controllers, which are trained separately. For training, we implement an environment that expresses the quadrotor's tracking error dynamics by extending the OpenAI Gym environment class. Finally, a simulation example demonstrates position tracking errors within 0.0166 m and 0.0254 m on the horizontal and vertical axes, respectively.
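
The abstract does not give implementation details, but the following minimal sketch illustrates the kind of Gym environment it describes, here for the position subsystem only. The double-integrator error model, reward weights, action limits, initialization, and episode horizon are all illustrative assumptions, not the paper's actual reference model or error dynamics.

```python
# Minimal sketch (classic OpenAI Gym API) of an environment expressing
# tracking error dynamics, as the abstract describes. The double-integrator
# error model, reward weights, action limits, and episode horizon below are
# illustrative assumptions -- the paper's actual reference model and error
# dynamics are not stated in the abstract.
import numpy as np
import gym
from gym import spaces


class TrackingErrorEnv(gym.Env):
    """Position-subsystem tracking error dynamics (hypothetical form).

    Observation: per-axis tracking error e and its rate e_dot (6-dim).
    Action: per-axis thrust command (3-dim), matching the abstract's
    description of the actor's input and output.
    """

    def __init__(self, dt=0.01, horizon=500):
        self.dt = dt
        self.horizon = horizon
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(6,), dtype=np.float32)
        self.action_space = spaces.Box(
            low=-1.0, high=1.0, shape=(3,), dtype=np.float32)
        self.state = None
        self.steps = 0

    def reset(self):
        # Assumed initialization: a random initial tracking error.
        self.state = np.random.uniform(-0.5, 0.5, size=6).astype(np.float32)
        self.steps = 0
        return self.state

    def step(self, action):
        a = np.clip(np.asarray(action, dtype=np.float32), -1.0, 1.0)
        e, e_dot = self.state[:3], self.state[3:]
        # Assumed error dynamics: the thrust command acts as an
        # acceleration on each axis error (forward-Euler integration).
        e = e + self.dt * e_dot
        e_dot = e_dot + self.dt * a
        self.state = np.concatenate([e, e_dot]).astype(np.float32)
        self.steps += 1
        # Quadratic penalty on error, error rate, and control effort --
        # a common reward shape; the paper's exact reward is not given.
        reward = -float(e @ e + 0.1 * (e_dot @ e_dot) + 0.01 * (a @ a))
        done = self.steps >= self.horizon
        return self.state, reward, done, {}
```

Such an environment can then be handed to any PPO implementation (the paper trains its own actor and critic networks, decentralized into separate attitude and position controllers); for instance, Stable-Baselines3's `PPO("MlpPolicy", TrackingErrorEnv())` would train against it, subject to Gym-version compatibility.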
