Abstract

This paper presents a proximal policy optimization (PPO)-based model reference tracking controller design for quadrotor unmanned aerial vehicles (UAVs). First, the quadrotor UAV is divided into attitude and position subsystems, each expressed as a nonlinear state-space equation. We then design a linear reference model, from which the tracking error dynamics are derived. The proposed neural network model implements a PPO-based reinforcement learning algorithm consisting of an actor that approximates the policy and a critic that approximates the state-value function. The actor receives the state variables of the tracking error dynamics as input and outputs the thrust values to be applied along each axis. We further decentralize the PPO-based controller into attitude and position controllers, which are trained separately. For training, we implement an environment that expresses the quadrotor's tracking error dynamics by extending the OpenAI Gym environment class. Finally, a simulation example demonstrates position tracking errors within 0.0166 m and 0.0254 m on the horizontal and vertical axes, respectively.
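
The abstract does not give implementation details, but the following minimal sketch illustrates the kind of Gym environment it describes, here for the position subsystem only. The double-integrator error model, reward weights, action limits, initialization, and episode horizon are all illustrative assumptions, not the paper's actual reference model or error dynamics.

```python
# Minimal sketch (classic OpenAI Gym API) of an environment expressing
# tracking error dynamics, as the abstract describes. The double-integrator
# error model, reward weights, action limits, and episode horizon below are
# illustrative assumptions -- the paper's actual reference model and error
# dynamics are not stated in the abstract.
import numpy as np
import gym
from gym import spaces


class TrackingErrorEnv(gym.Env):
    """Position-subsystem tracking error dynamics (hypothetical form).

    Observation: per-axis tracking error e and its rate e_dot (6-dim).
    Action: per-axis thrust command (3-dim), matching the abstract's
    description of the actor's input and output.
    """

    def __init__(self, dt=0.01, horizon=500):
        self.dt = dt
        self.horizon = horizon
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(6,), dtype=np.float32)
        self.action_space = spaces.Box(
            low=-1.0, high=1.0, shape=(3,), dtype=np.float32)
        self.state = None
        self.steps = 0

    def reset(self):
        # Assumed initialization: a random initial tracking error.
        self.state = np.random.uniform(-0.5, 0.5, size=6).astype(np.float32)
        self.steps = 0
        return self.state

    def step(self, action):
        a = np.clip(np.asarray(action, dtype=np.float32), -1.0, 1.0)
        e, e_dot = self.state[:3], self.state[3:]
        # Assumed error dynamics: the thrust command acts as an
        # acceleration on each axis error (forward-Euler integration).
        e = e + self.dt * e_dot
        e_dot = e_dot + self.dt * a
        self.state = np.concatenate([e, e_dot]).astype(np.float32)
        self.steps += 1
        # Quadratic penalty on error, error rate, and control effort --
        # a common reward shape; the paper's exact reward is not given.
        reward = -float(e @ e + 0.1 * (e_dot @ e_dot) + 0.01 * (a @ a))
        done = self.steps >= self.horizon
        return self.state, reward, done, {}
```

Such an environment can then be handed to any PPO implementation (the paper trains its own actor and critic networks, decentralized into separate attitude and position controllers); for instance, Stable-Baselines3's `PPO("MlpPolicy", TrackingErrorEnv())` would train against it, subject to Gym-version compatibility.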
