Abstract

This paper proposes a reinforcement learning (RL) based path following strategy for underactuated airships with magnitude and rate saturation. The Markov decision process (MDP) model for the control problem is established. Then an error bounded line-of-sight (LOS) guidance law is investigated to restrain the state space. Subsequently, a proximal policy optimization (PPO) algorithm is employed to approximate the optimal action policy through trial and error. Since the optimal action policy is generated from the action space, the magnitude and rate saturation can be avoided. The simulation results, involving circular, general, broken-line, and anti-wind path following tasks, demonstrate that the proposed control scheme can transfer to new tasks without adaptation, and possesses satisfying real-time performance and robustness.
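Since the abstract notes that saturation is avoided by drawing actions directly from a bounded action space, here is a minimal sketch of one common way such bounds can be enforced; the function name `bounded_action`, the tanh squashing, and the explicit rate clip are illustrative assumptions, not the paper's exact parameterization.

```python
import math

def bounded_action(policy_output: float, prev_action: float,
                   max_mag: float, max_rate: float, dt: float) -> float:
    """Map an unbounded policy output to an actuator command that
    respects both magnitude and rate limits (illustrative scheme:
    tanh squashing for the magnitude limit, then clipping the change
    from the previous command for the rate limit)."""
    a = max_mag * math.tanh(policy_output)           # |a| <= max_mag
    step = max_rate * dt                             # largest allowed change
    return max(prev_action - step, min(prev_action + step, a))

# A large policy output is squashed to the magnitude bound, and the
# per-step change from prev_action never exceeds max_rate * dt.
cmd = bounded_action(100.0, 0.0, max_mag=5.0, max_rate=1.0, dt=0.1)
```

Because every command the environment receives already lies inside both limits, the learned policy never drives the actuators into saturation.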

Highlights

  • As lighter-than-air vehicles, airships have distinct advantages over other vehicles in ultra-long-duration flight and low fuel consumption, making them a cost-effective platform for communication relay, monitoring, surveillance, and scientific exploration

  • Although the controller exhibits tracking errors, they are acceptable for the airship, and the proposed controller remains effective and robust in the presence of actuator magnitude and rate saturation

  • A real-time path following controller is presented for airships subject to actuator magnitude and rate saturation

Summary

Introduction

As lighter-than-air vehicles, airships have distinct advantages over other vehicles in ultra-long-duration flight and low fuel consumption, making them a cost-effective platform for communication relay, monitoring, surveillance, and scientific exploration. In Reference [31], a decentralized state feedback controller based on linear matrix inequality (LMI) conditions is applied to a class of nonlinear interconnected systems subject to magnitude and rate constraints. Motivated by the discussions above, this paper proposes an RL-based path following controller for underactuated airships subject to magnitude and rate saturation. An error bounded LOS guidance law is presented to restrain the state space, while a PPO-based algorithm is employed to acquire the action policy through trial and error. Each path following task is converted to a point-following task in which the target point is always distributed in a specified bounded space, so exhaustive training in this space generates optimal action policies.
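The guidance step described above can be sketched as follows; the lookahead distance, the symmetric error bound, and the saturation used here are illustrative assumptions, not the paper's exact error-bounded LOS formulation.

```python
import math

def los_guidance(cross_track_error: float, path_angle: float,
                 lookahead: float, max_error: float) -> float:
    """Compute a desired heading from a line-of-sight guidance law.

    The cross-track error is saturated to +/-max_error first, which
    bounds the guidance command and hence restrains the state space
    seen by the RL policy (an illustrative error-bounding choice).
    """
    e = max(-max_error, min(max_error, cross_track_error))
    # Classic LOS: steer toward a point `lookahead` ahead on the path.
    return path_angle - math.atan2(e, lookahead)

# Example: 5 m cross-track error, straight path along the x-axis,
# 20 m lookahead distance -> a small negative heading correction.
heading = los_guidance(5.0, 0.0, lookahead=20.0, max_error=10.0)
```

Feeding the policy this bounded heading error, rather than raw position, keeps the states encountered at training time inside a compact set, which is what allows exhaustive training over that set.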

Airship Model
Path Following Controller Design
Error Bounded LOS Guidance
MDP Model of the Airship
Optimization Process of PPO
Simulations
Controller Training and Comparing
Compared Method
Circular Path Following
General Path Following
Broken-Line Path Following
Anti-Wind Path Following
Findings
Discussion
Conclusions