Abstract

In this paper, methods are presented for designing a quadrotor attitude control system with disturbance rejection ability, wherein only one parameter needs to be tuned for each axis. The core differences between quadrotor platforms are extracted as critical gain parameters (CGPs). Reinforcement learning (RL) is introduced to automatically optimize the control law for quadrotors with different CGPs, and the CGPs are used to extend the RL state list. A deterministic policy gradient (DPG) algorithm based on a model-free actor-critic structure is used as the learning algorithm. Mirror sampling and reward shaping methods are designed to eliminate the steady-state errors of the RL controller and accelerate the training process. Active disturbance rejection control (ADRC) is applied to reject unknown external disturbances: a set of extended state observers (ESOs) is designed to estimate the total disturbance on the roll and pitch axes. The covariance matrix adaptation evolution strategy (CMA-ES) algorithm is used to automatically tune the ESO parameters and improve the final performance. The complete controller is tested on an F550 quadrotor in both simulation and real flight environments. The quadrotor can hover and maneuver stably and accurately, even under severe disturbance.
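
The CMA-ES step mentioned in the abstract can be illustrated with a short tuning loop. Below is a minimal sketch, assuming the open-source Python `cma` package and a toy cost that scores how well a two-state observer tracks a constant disturbance on a first-order axis model; the two-gain search space, the plant model, and the cost function are illustrative assumptions rather than the paper's exact setup.

    # Hedged sketch: ESO gain tuning with CMA-ES via the `cma` package
    # (pip install cma). The toy cost and two-gain search space are assumptions.
    import cma

    def eso_cost(log_gains, b0=100.0, dt=0.002, steps=1500):
        beta1, beta2 = 10.0 ** log_gains[0], 10.0 ** log_gains[1]
        y, f_true, u = 0.0, 3.0, 0.0      # plant state, true disturbance, input
        z1, z2, cost = 0.0, 0.0, 0.0      # observer states and accumulated cost
        for _ in range(steps):
            y += dt * (f_true + b0 * u)   # first-order axis model: y_dot = f + b0*u
            e = y - z1                    # observer innovation
            z1 += dt * (z2 + b0 * u + beta1 * e)
            z2 += dt * (beta2 * e)
            cost += dt * (f_true - z2) ** 2   # penalize disturbance-estimation error
        return cost

    # Search log10(beta1), log10(beta2) so both gains stay positive and well scaled.
    es = cma.CMAEvolutionStrategy([1.0, 2.0], 0.3, {'maxiter': 50})
    while not es.stop():
        candidates = es.ask()                                   # sampling
        es.tell(candidates, [eso_cost(c) for c in candidates])  # costs drive mean/covariance/step-size updates
    print("tuned gains:", 10.0 ** es.result.xbest[0], 10.0 ** es.result.xbest[1])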

Highlights

  • Quadrotor unmanned aerial vehicles (UAVs) have shown great potential for both civil and military applications because of their low cost, small size, simple structure, flexible operation, and vertical take-off and landing capability [1,2]

  • The critical gain parameter (CGP) is an estimate of the system's response capability, and the accuracy of the CGP is critical to the performance of both the extended state observers (ESOs) and the state error feedback (SEF); a minimal per-axis sketch follows this list

  • We present ESO parameter tuning with the covariance matrix adaptation evolution strategy (CMA-ES) algorithm
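
As referenced above, here is a minimal per-axis sketch of one common discrete-time formulation of an ESO plus proportional state error feedback for a first-order axis model y_dot = f + b0*u, where f is the total disturbance and b0 plays the role of the CGP. The two-state observer form, the gain names (beta1, beta2, k_sef), and the proportional SEF are assumptions for illustration, not necessarily the paper's exact equations.

    # Hedged sketch: ESO + SEF for one attitude axis; names and the exact
    # observer/feedback forms are illustrative assumptions.
    class AxisESO:
        """Two-state observer: z1 tracks the measured rate y, z2 tracks the total disturbance f."""
        def __init__(self, b0, beta1, beta2, dt):
            self.b0, self.beta1, self.beta2, self.dt = b0, beta1, beta2, dt
            self.z1 = 0.0   # estimate of the axis rate
            self.z2 = 0.0   # estimate of the total disturbance

        def update(self, y_meas, u):
            e = y_meas - self.z1
            self.z1 += self.dt * (self.z2 + self.b0 * u + self.beta1 * e)
            self.z2 += self.dt * (self.beta2 * e)
            return self.z2

    def sef_control(rate_ref, rate_meas, dist_est, b0, k_sef):
        """Proportional state error feedback with disturbance compensation through the CGP-like gain b0."""
        u0 = k_sef * (rate_ref - rate_meas)   # nominal feedback
        return (u0 - dist_est) / b0           # cancel the estimated disturbance

An inaccurate b0 (CGP) mis-scales both the observer's input channel and the compensation term, which is why the highlight above stresses CGP accuracy for both the ESO and the SEF.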


Summary

Introduction

Quadrotor unmanned aerial vehicles (UAVs) have shown great potential for both civil and military applications because of their low cost, small size, simple structure, flexible operation, and vertical take-off and landing capability [1,2]. Computer-aided design (CAD) has been used to obtain the parameters of a quadrupedal robot in [22], and noise and disturbances can be added to either the system models or the sensor data during the training process. We tested this method on a quadrotor in our previous study, and the trained policy was effective for platforms with different system parameters in a simulation environment. A new method is presented here to achieve simple-tuning control with RL: the core differences between platforms are extracted as critical system parameters and included in the state list during the RL training process. RL is used to develop a single-parameter-tuned quadrotor attitude control system that exhibits excellent performance in both simulation and real flight experiments. This method offers the considerable advantage of requiring only one parameter, the CGP, to be tuned for each axis.
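
The idea of extending the RL state list with the CGP can be sketched as follows; the observation layout (angle error, angular rate, CGP) and the small deterministic actor network are illustrative assumptions about the DPG/actor-critic setup, not the paper's exact architecture.

    # Hedged sketch: a CGP-extended observation feeding a deterministic actor,
    # so one trained policy can serve platforms with different response capabilities.
    import numpy as np
    import torch
    import torch.nn as nn

    def build_observation(angle_err, angular_rate, cgp):
        """State list for one axis, extended with the critical gain parameter.
        In practice the CGP would typically be normalized before training."""
        return np.array([angle_err, angular_rate, cgp], dtype=np.float32)

    class Actor(nn.Module):
        """Deterministic policy: observation -> bounded per-axis command."""
        def __init__(self, obs_dim=3, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1), nn.Tanh(),   # action scaled to [-1, 1]
            )

        def forward(self, obs):
            return self.net(obs)

    actor = Actor()
    obs = torch.from_numpy(build_observation(angle_err=0.1, angular_rate=-0.3, cgp=1.0))
    action = actor(obs)   # the same network handles different CGP values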

System Model and Identification
Quadrotor Attitude Control System Design
First-Order ESO and SEF for Quadrotor Attitude Control
Estimation of the CGPs
Single-Parameter-Tuned Attitude Control Method
Network Structures
Reward Function Shaping and Mirror Sampling Methods
Training Algorithm
ESO Parameter Tuning with CMA-ES Algorithm
Sampling
Calculating Costs
Updating the Mean Value
Adapting the Covariance Matrix
Step-Size Control
Training Environment
Quadrotor Details
Performance of the RL Controller and Training Details
CMA-ES Training Detail in the Simulation Environment
Performance of the Full Controller
Experiments
CMA-ES Training Result and Detail on Real Platform
Outdoor Flight Experiment Result
Conclusions