Abstract

Classical gradient-based approximate dynamic programming approaches provide reliable and fast solution platforms for various optimal control problems. However, they depend on accurate models, and the efficiency of the resulting solutions degrades severely in uncertain dynamical environments. Herein, a novel online adaptive learning framework is introduced to solve action-dependent dual heuristic dynamic programming problems. The approach does not depend on the dynamical models of the considered systems. Instead, it employs optimization principles to produce model-free control strategies. A policy iteration process is employed to solve the underlying Hamilton–Jacobi–Bellman equation by means of adaptive critics, where a layer of separate actor-critic neural networks is employed along with gradient descent adaptation rules. A Riccati development is introduced and shown to be equivalent to solving the underlying Hamilton–Jacobi–Bellman equation. The proposed approach is applied to the challenging weight shift control problem of a flexible wing aircraft. The continuous nonlinear deformation in the aircraft’s flexible wing leads to various aerodynamic variations at different trim speeds, which makes its auto-pilot control a complicated task. A series of numerical simulations was carried out to demonstrate the effectiveness of the suggested strategy.

Highlights

  • Various Approximate Dynamic Programming (ADP) methods have been employed to solve the optimal control problems for single and multi-agent systems [1,2,3,4,5,6]

  • The approach uses model-free control structures and gradient-based value functions. This serves as a model-free solution framework for the classical Action Dependent Dual Heuristic Dynamic Programming problems

  • These results establish the duality between the Hamiltonian function and the Bellman equation for the Action Dependent Dual Heuristic Dynamic Programming solutions

Summary

Introduction

Various Approximate Dynamic Programming (ADP) methods have been employed to solve the optimal control problems for single and multi-agent systems [1,2,3,4,5,6]. An online adaptive learning approach, based on a gradient structure, is employed to solve the challenging control problem of flexible wing aircraft. Reinforcement Learning approaches use various forms of temporal difference equations to solve the optimization problems associated with the dynamical systems [1,18]. This implies finding ways to penalize or reward the attempted control strategies to optimize a certain objective function. Adaptive critics provide prominent solution frameworks for the adaptive dynamic programming problems [31]. They are employed to produce expert paradigms that can undergo learning processes while solving the underlying optimization challenges. The approach uses model-free control structures and gradient-based value functions. This serves as a model-free solution framework for the classical Action Dependent Dual Heuristic Dynamic Programming problems.
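
To make the idea of a model-free, action-dependent policy iteration concrete, the following is a minimal sketch under stated assumptions, not the method of the paper. It swaps the actor-critic neural networks and gradient descent rules for a batch least-squares fit of a quadratic action-dependent value function, and it uses a hypothetical two-state linear plant with assumed quadratic weights (A, B, Qx, Ru and all function names are illustrative). The plant matrices are used only to generate data and to compute a Riccati reference gain for comparison; the learner itself sees only samples (x, u, x_next).

```python
import numpy as np

# Hypothetical toy plant: used only to simulate data, never read by the learner.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Qx = np.eye(2)  # assumed state weighting
Ru = np.eye(1)  # assumed control weighting
n, m = 2, 1     # state and control dimensions


def quad_features(z):
    # Monomials z_i * z_j for i <= j, so that features @ theta == z' H z.
    i, j = np.triu_indices(len(z))
    return z[i] * z[j]


def theta_to_H(theta, size):
    # Off-diagonal monomials carry 2 * H_ij, so symmetrizing halves them.
    H = np.zeros((size, size))
    H[np.triu_indices(size)] = theta
    return (H + H.T) / 2


def evaluate_policy(K, samples):
    # Temporal difference form: z_k' H z_k - z_{k+1}' H z_{k+1} = cost_k,
    # where the next action follows the current policy u = -K x.
    rows, targets = [], []
    for x, u, x_next in samples:
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, -K @ x_next])
        rows.append(quad_features(z) - quad_features(z_next))
        targets.append(x @ Qx @ x + u @ Ru @ u)
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta_to_H(theta, n + m)


def improve_policy(H):
    # Minimizing z' H z over u gives u = -inv(H_uu) H_ux x.
    return np.linalg.solve(H[n:, n:], H[n:, :n])


def riccati_gain(iters=2000):
    # Model-based reference: fixed-point iteration of the discrete Riccati equation.
    P = Qx.copy()
    for _ in range(iters):
        G = np.linalg.solve(Ru + B.T @ P @ B, B.T @ P @ A)
        P = Qx + A.T @ P @ (A - B @ G)
    return G


# Collect exploratory samples (random states and actions).
rng = np.random.default_rng(0)
samples = []
for _ in range(60):
    x = rng.standard_normal(n)
    u = rng.standard_normal(m)
    samples.append((x, u, A @ x + B @ u))

# Policy iteration from K = 0, which is stabilizing because this A is stable.
K = np.zeros((m, n))
for _ in range(10):
    K = improve_policy(evaluate_policy(K, samples))

K_ref = riccati_gain()
print("learned gain:", K)
print("Riccati gain:", K_ref)
```

In this noise-free linear-quadratic setting the data-driven gain and the Riccati gain should agree closely, which illustrates in a simplified way why solving the Bellman equation for an action-dependent value function can be related to a Riccati equation. The nonlinear aircraft problem treated in the paper needs the adaptive critic machinery described there.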

Control Mechanism of a Flexible Wing Aircraft
Bellman Equation Formulation
Model-Based Policy Formulation
Model-Free Policy Formulation
Hamilton–Jacobi–Bellman Formulation
The Hamiltonian Mechanics
Hamiltonian–Bellman Solutions Duality
The Adaptive Learning Solution and Riccati Development
Model-Free Gradient-Based Solution
Riccati Development
Adaptive Critics Implementations
Actor-Critic Neural Networks Implementation
Simulation Results
Simulation Parameters
Simulation Case I
Simulation Case II
Conclusions