This work presents a new framework for model-based continuous-time reinforcement learning (CT-RL) control of hypersonic vehicles (HSVs). The two predominant classes of CT-RL methods for general nonlinear systems, adaptive dynamic programming (ADP) and deep RL, tend either to offer substantial theoretical results but lack practical synthesis capability (ADP) or to show empirical promise without theoretical guarantees (deep RL). Meanwhile, RL control frameworks developed directly for HSVs tend to require a simplified model and a complicated control structure, and they lack the substantial numerical evaluations essential for real-world flight implementation. To address these challenges directly, we propose a new decentralized excitable integral reinforcement learning framework in which reference-input-based exploration improves persistence of excitation. Combining this framework with new insights on prescaling and the established decentralized control structure for HSVs, we show that the resulting controller achieves significant performance improvements over classical linear quadratic regulator (LQR) and feedback linearization methods. Additionally, we provide convergence, optimality, and closed-loop stability guarantees for the proposed method. We demonstrate these performance guarantees through a substantial and systematic set of numerical evaluations on an unstable, nonminimum-phase HSV model subject to varying modeling errors and initial conditions.