Abstract

This paper develops the adaptive dynamic programming toolbox (ADPT), a MATLAB-based software package that computationally solves optimal control problems for continuous-time control-affine systems. The ADPT produces approximate optimal feedback controls by employing the adaptive dynamic programming technique to solve the Hamilton–Jacobi–Bellman equation approximately. A novel implementation method is derived to reduce the ADPT's memory consumption throughout its execution. The ADPT supports two working modes: a model-based mode and a model-free mode. In the former, the ADPT computes optimal feedback controls from the given system dynamics; in the latter, optimal feedback controls are generated from measurements of system trajectories, without requiring knowledge of the system model. Multiple setting options are provided so that various customized circumstances can be accommodated. Compared to other popular software toolboxes for optimal control, the ADPT offers advantages in computational precision and time efficiency, which is illustrated by its application to a highly non-linear satellite attitude control problem.

Highlights

  • Optimal control is an important branch of control engineering

  • Under the assumption that the optimal control and the optimal cost function can be represented as Taylor series, plugging the series expansions of the dynamics, the cost integrand, the optimal control, and the optimal cost function into the HJB equation and collecting terms degree by degree recursively yields the Taylor expansions of the optimal control and the optimal cost function (a sketch of this expansion is given after this list)

  • By employing the adaptive dynamic programming technique, we propose a computational methodology that approximately produces the optimal control and the optimal cost function, in which the Kronecker product used in the previous literature is replaced by the Euclidean inner product to reduce memory consumption at runtime (a small storage comparison is sketched after this list)
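To make the expansion in the second highlight concrete, the following is a minimal sketch of the degree-by-degree procedure for a control-affine system with quadratic control cost. The notation (f, g, q, R, V, u) is generic and chosen here for illustration; it is not necessarily the paper's exact formulation:

    % Control-affine dynamics and infinite-horizon cost (generic notation)
    \dot{x} = f(x) + g(x)\,u, \qquad
    J(x_0, u) = \int_0^{\infty} \bigl( q(x) + u^{\top} R\, u \bigr)\, dt

    % HJB equation and the optimal control it induces
    0 = \min_{u} \Bigl[ q(x) + u^{\top} R\, u
          + \nabla V(x)^{\top} \bigl( f(x) + g(x)\,u \bigr) \Bigr],
    \qquad
    u^{*}(x) = -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V(x)

    % Taylor expansions, with A = Df(0), B = g(0),
    % q(x) = x^{\top} Q x + q^{[3]}(x) + \cdots, and V^{[k]}, u^{[k]}
    % the homogeneous degree-k parts:
    V(x) = V^{[2]}(x) + V^{[3]}(x) + \cdots, \qquad
    u(x) = u^{[1]}(x) + u^{[2]}(x) + \cdots

    % Collecting the degree-2 terms recovers the Riccati equation of the
    % linearization:
    A^{\top} P + P A - P B R^{-1} B^{\top} P + Q = 0,
    \quad V^{[2]}(x) = x^{\top} P x,
    \quad u^{[1]}(x) = -R^{-1} B^{\top} P x

Each degree k ≥ 3 then yields a linear equation in the coefficients of V^{[k]}, which is what makes the recursion tractable.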
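The memory point in the third highlight can be illustrated with a small, self-contained Python sketch (an illustration of the general idea, not the ADPT's actual code): a homogeneous degree-d polynomial in n variables needs n^d coefficients when stored against the d-fold Kronecker power of x, but only C(n+d-1, d) coefficients when stored against a basis of distinct monomials and evaluated through a Euclidean inner product.

    # Illustrative sketch (not the ADPT implementation) contrasting two ways
    # to store a homogeneous degree-d polynomial value function:
    #   Kronecker form:     V(x) = w . (x (x) x (x) ... (x) x), n**d entries,
    #                       most of them redundant (x1*x2 and x2*x1 both stored)
    #   Inner-product form: V(x) = c . phi(x) with phi a basis of *distinct*
    #                       monomials, only C(n+d-1, d) entries
    from itertools import combinations_with_replacement
    from math import comb

    import numpy as np

    def kron_power(x, d):
        """d-fold Kronecker power x (x) ... (x) x, of length n**d."""
        v = x
        for _ in range(d - 1):
            v = np.kron(v, x)
        return v

    def monomial_basis(x, d):
        """All distinct degree-d monomials of x, of length C(n+d-1, d)."""
        n = len(x)
        return np.array([np.prod(x[list(idx)])
                         for idx in combinations_with_replacement(range(n), d)])

    n, d = 6, 4
    x = np.random.randn(n)
    print(len(kron_power(x, d)))      # 1296 = 6**4 coefficients needed
    print(len(monomial_basis(x, d)))  # 126 = C(9, 4) coefficients needed
    print(comb(n + d - 1, d))         # 126, confirming the count

Since C(n+d-1, d) grows far more slowly than n^d as the state dimension and approximation degree increase, pairing coefficients with distinct monomials through an inner product is where the runtime memory saving comes from.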


Introduction

Optimal control is an important branch of control engineering. For continuous-time dynamical systems, finding an optimal feedback control involves solving the so-called Hamilton–Jacobi–Bellman (HJB) equation [1]. For non-linear systems, solving the HJB equation is generally a formidable task due to its inherently non-linear nature. Al'brekht proposed a power series method for smooth systems to solve the HJB equation [3]. Starting from an admissible control, a recursive algorithm was developed that sequentially improves the control law until it converges to the optimal one [6]. This recursive algorithm is commonly referred to as policy iteration (PI) and can be found in [7,8,9]. The common limitation of these methods is that complete knowledge of the system dynamics is required.
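As an illustration of the policy-iteration recursion just described, here is a minimal Python sketch of its linear-quadratic special case (Kleinman's algorithm). The matrices A, B, Q, R are illustrative placeholders, and this is a conceptual sketch rather than the ADPT's implementation:

    # Policy iteration for the linear-quadratic case (Kleinman's algorithm):
    # starting from an admissible (stabilizing) gain, alternate policy
    # evaluation and policy improvement until the gain converges.
    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    A = np.array([[-1.0, 2.0],
                  [0.0, -3.0]])   # open-loop stable, so K = 0 is admissible
    B = np.array([[0.0],
                  [1.0]])
    Q = np.eye(2)                 # state cost weight
    R = np.array([[1.0]])         # control cost weight

    K = np.zeros((1, 2))          # admissible initial control law u = -K x
    for _ in range(50):
        # Policy evaluation: (A - B K)^T P + P (A - B K) = -(Q + K^T R K)
        P = solve_continuous_lyapunov((A - B @ K).T, -(Q + K.T @ R @ K))
        # Policy improvement: K <- R^{-1} B^T P
        K_new = np.linalg.solve(R, B.T @ P)
        if np.linalg.norm(K_new - K) < 1e-10:
            break
        K = K_new

    print("approximately optimal gain K:", K)

Each pass solves a Lyapunov equation for the current control law (policy evaluation) and then updates the gain (policy improvement); starting from a stabilizing gain, the iterates converge to the optimal feedback gain. Note that this sketch uses the model matrices A and B explicitly, which is exactly the limitation, requiring complete knowledge of the system dynamics, that the model-free mode of the ADPT is designed to avoid.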
