Abstract
This paper proposes a reinforcement learning (RL) based path following strategy for underactuated airships subject to actuator magnitude and rate saturation. A Markov decision process (MDP) model of the control problem is first established. An error-bounded line-of-sight (LOS) guidance law is then investigated to restrain the state space. Subsequently, a proximal policy optimization (PPO) algorithm is employed to approximate the optimal action policy through trial and error. Since the optimal action policy is generated directly from the bounded action space, magnitude and rate saturation are avoided by construction. Simulation results on circular, general, broken-line, and anti-wind path following tasks demonstrate that the proposed control scheme transfers to new tasks without adaptation and possesses satisfactory real-time performance and robustness.
Highlights
As a kind of lighter-than-air vehicle, airships have distinct advantages over other vehicles in ultra-long-duration flight and low fuel consumption, making them a cost-effective platform for communication relay, monitoring, surveillance, and scientific exploration
Despite tracking errors, which remain acceptable for the airship, the proposed controller achieves satisfactory effectiveness and robustness in the presence of actuator magnitude and rate saturation
A real-time path following controller has been presented for airships with actuator magnitude and rate saturation
Summary
As a kind of lighter-than-air vehicle, airships have distinct advantages over other vehicles in ultra-long-duration flight and low fuel consumption, making them a cost-effective platform for communication relay, monitoring, surveillance, and scientific exploration. In Reference [31], a decentralized state feedback controller based on linear matrix inequality (LMI) conditions is applied to a class of nonlinear interconnected systems subject to magnitude and rate constraints. Motivated by the discussions mentioned above, an RL-based path following controller is proposed in this paper for underactuated airships subject to magnitude and rate saturation. A LOS guidance law is presented to restrain the state space, while a PPO-based algorithm is employed to acquire the action policy through trial and error. The path following tasks are converted to a point-following task in which the point is always distributed in a specified bounded space; exhaustive training in this space generates optimal action policies.
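The two mechanisms summarized above, an error-bounded LOS guidance law that keeps the RL state space within a known range, and actuator commands limited in both magnitude and rate, can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's implementation: the function names, the lookahead distance, and the limit values are all hypothetical.

```python
import math

def los_guidance(cross_track_error, path_angle, lookahead=20.0, e_max=50.0):
    """Error-bounded LOS guidance (hypothetical sketch).

    The cross-track error is clipped to [-e_max, e_max] before the desired
    heading is computed, so the guidance output, and hence the RL state
    derived from it, stays within a bounded range.
    """
    e = max(-e_max, min(e_max, cross_track_error))  # error bound
    return path_angle + math.atan2(-e, lookahead)   # classic LOS heading law

def saturate(u_cmd, u_prev, u_max=1.0, du_max=0.1):
    """Magnitude and rate saturation of one actuator command per step.

    First clip the command to the magnitude limit, then clip its change
    from the previous step to the rate limit.
    """
    u = max(-u_max, min(u_max, u_cmd))          # magnitude limit
    du = max(-du_max, min(du_max, u - u_prev))  # rate limit
    return u_prev + du
```

In the paper's scheme the PPO policy samples actions directly from a bounded action space, so the limits are respected by construction rather than enforced after the fact; the `saturate` helper above only shows what the magnitude and rate constraints mean for a single actuator.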