Cooperative Differential Game-Based Modular Unmanned System Approximate Optimal Control: An Adaptive Critic Design Approach
An approximate optimal control problem for modular unmanned systems (MUSs) is formulated as a cooperative differential game to solve the trajectory tracking problem. First, the dynamic model of the modular unmanned system is built with the joint torque feedback technique, where the moment of inertia of the motor rotor is positive and symmetric. Each MUS module is treated as a player in the cooperative differential game. The MUS trajectory tracking problem is then transformed into an approximate optimal control problem by means of adaptive critic design (ACD). The approximate optimal control is obtained by a critic network that approximates the joint performance index function of the system. The stability of the closed-loop system is proved through Lyapunov theory, and the feasibility of the proposed control algorithm is verified on an experimental platform.
- Conference Article
- 10.23919/acc45564.2020.9147921
- Jul 1, 2020
By exploiting min-plus linearity, semiconcavity, and semigroup properties of dynamic programming, a fundamental solution semigroup for a class of approximate finite horizon linear infinite dimensional optimal control problems is constructed. Elements of this fundamental solution semigroup are parameterized by the time horizon, and can be used to approximate the solution of the corresponding finite horizon optimal control problem for any terminal cost. They can also be composed to compute approximations on longer horizons. The value function approximation provided takes the form of a min-plus convolution of a kernel with the terminal cost. A general construction for this kernel is provided, along with a spectral representation for a restricted class of sub-problems.
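To make the min-plus convolution concrete on a discrete state grid (the kernel and terminal cost below are illustrative stand-ins, not the paper's construction), a sketch in the min-plus algebra replaces sum with min and product with sum:

```python
import numpy as np

def minplus_conv(K, phi):
    """Min-plus (tropical) convolution on a state grid:
    (K (*) phi)(x) = min over z of [ K(x, z) + phi(z) ]."""
    return np.min(K + phi[None, :], axis=1)

# Illustrative data: a distance-like kernel and a quadratic terminal cost.
grid = np.linspace(-2.0, 2.0, 41)
K = np.abs(grid[:, None] - grid[None, :])   # hypothetical kernel K(x, z)
phi = grid ** 2                             # terminal cost phi(z)
v = minplus_conv(K, phi)                    # value-function approximation on the grid
```

In the fundamental solution semigroup view, a horizon-indexed family of such kernels plays the role that state-transition operators play in the conventional algebra: once the kernel for a given horizon is known, the value function for any terminal cost follows from one min-plus convolution, and kernels compose to cover longer horizons.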
- Research Article
- 10.1155/2014/752854
- Jan 1, 2014
- Abstract and Applied Analysis
We consider an optimal control problem subject to a terminal state equality constraint and continuous inequality constraints on the control and the state. By using the control parametrization method in conjunction with a time scaling transform, the constrained optimal control problem is approximated by an optimal parameter selection problem with the terminal state equality constraint and continuous inequality constraints on the control and the state. On this basis, a simple exact penalty function method is used to transform the constrained optimal parameter selection problem into a sequence of approximate unconstrained optimal control problems. It is shown that, if the penalty parameter is sufficiently large, the locally optimal solutions of these approximate unconstrained optimal control problems converge to the solution of the original optimal control problem. Finally, numerical simulations on two examples demonstrate the effectiveness of the proposed method.
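A minimal sketch of the exact penalty idea on a hypothetical finite-dimensional problem (minimize ||x||^2 subject to x0 + x1 >= 1; the penalty weight and step sizes are illustrative): subgradient descent on the penalized objective recovers the constrained optimum once the penalty parameter is large enough.

```python
import numpy as np

def f(x):
    """Hypothetical objective: ||x||^2."""
    return float(x @ x)

def g(x):
    """Hypothetical inequality constraint g(x) <= 0, i.e. x0 + x1 >= 1."""
    return 1.0 - x[0] - x[1]

def penalized(x, rho):
    # Non-smooth exact penalty: F(x) = f(x) + rho * max(0, g(x)).
    return f(x) + rho * max(0.0, g(x))

def solve(rho=10.0, steps=5000, lr=0.5):
    """Subgradient descent on the exact-penalty objective, diminishing steps."""
    x = np.zeros(2)
    for k in range(1, steps + 1):
        grad = 2.0 * x                      # gradient of f
        if g(x) > 0.0:                      # subgradient of rho * max(0, g)
            grad += rho * np.array([-1.0, -1.0])
        x -= (lr / k) * grad
    return x

x_star = solve()
```

For any penalty parameter larger than the optimal multiplier (here 1), a minimizer of the penalized problem is also a minimizer of the constrained problem, so `x_star` approaches (0.5, 0.5); the problem data are chosen only so that the solution is known in advance.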
- Research Article
- 10.1080/00207179.2023.2250880
- Sep 2, 2023
- International Journal of Control
This paper studies the approximate optimal event-triggered tracking control problem for a class of second-order nonlinear systems with prescribed performance. By employing a prescribed performance function, the restrictions on the tracking errors, including convergence rate, maximum overshoot, and steady-state error, are removed from the design. On this basis, the complex optimal tracking control problem can be transformed into an optimal stabilising control problem by constructing an augmented system. Subsequently, an event-triggered mechanism is introduced to reduce communication resources. Then, an adaptive critic learning algorithm is used to solve the Hamilton-Jacobi-Bellman equation, where the weights in the critic network are tuned through the gradient descent approach and experience replay technology. Through the developed learning method, the persistence of excitation condition is relaxed and the data efficiency is improved. Lyapunov stability theory is used to verify the stability of the closed-loop system and show that the tracking errors are uniformly ultimately bounded. Finally, simulation results are presented to show the effectiveness of the proposed approach.
- Conference Article
- 10.1109/icca.2009.5410360
- Dec 1, 2009
The boiler combustion process of a power plant is a typical multi-input, multi-output process with strong nonlinearity, strong disturbances, and close coupling. The coupling relationships among the parameters of the combustion process are intricate, so its optimal control problem is hard to solve with conventional control methods. Adaptive Critic Designs (ACDs) are well suited to approximate optimal control problems over time in complex nonlinear systems. However, most ACD structures have been built on back-propagation (BP) neural networks, whose controllers easily fall into local minima, so learning efficiency is often low and training may even fail. To speed up controller learning, this paper designs an optimal controller based on Dual Heuristic Programming (DHP) with a Generalized Radial Basis Function Neural Network (GRBFNN) and applies it to simulation control of the boiler combustion process. The results indicate that the designed controller is effective.
- Research Article
- 10.1016/j.asr.2024.11.073
- Dec 3, 2024
- Advances in Space Research
Finite-horizon approximate optimal attitude control based on adaptive dynamic programming for ultra-low-orbit satellite
- Conference Article
- 10.1145/3379247.3379265
- Jan 4, 2020
In this paper, the approximate optimal control problem for nonlinear systems with mismatched perturbations is addressed through an asymptotically stable critic neural network (NN). By employing the perturbation estimate from a nonlinear perturbation observer, an online-updated value function is constructed to reflect the real-time perturbations, regulation, and control simultaneously. To solve the Hamilton-Jacobi-Bellman equation, an asymptotically stable critic NN is established based on novel nested update laws. The approximate optimal control thus obtained guarantees, via Lyapunov's direct method, that the closed-loop system is uniformly ultimately bounded. Simulation results illustrate the effectiveness of the developed control scheme.
- Research Article
- 10.1016/j.ifacol.2017.08.1862
- Jul 1, 2017
- IFAC PapersOnLine
Approximate Optimal Control via Measurement Feedback for a Class of Nonlinear Systems
- Research Article
- 10.1177/09596518251383219
- Nov 11, 2025
- Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering
For a class of uncertain nonlinear systems with unknown disturbances and input constraints, a novel event-triggered adaptive dynamic programming control strategy based on recursive terminal sliding mode (ET-ADP-RTSM) is proposed. First, a recursive terminal sliding mode (RTSM) surface composed of a fast non-singular sliding mode surface and an integral sliding mode surface is constructed to ensure that the tracking error converges to zero with faster speed and higher accuracy. Second, the RTSM, the upper bound information of the disturbance function, and the constraint function of the control input are simultaneously incorporated into the utility function to construct an improved performance value function, transforming the robust nonlinear control problem of the system into an approximate optimal control problem. A nested update strategy is adopted when using the neural networks to approximate the optimal value function. An event-triggered (ET) constrained tracking Hamilton-Jacobi-Bellman (HJB) equation is established, and only one critic neural network (NN) is used to learn the optimal value function and obtain the optimal tracking controller. Finally, based on Lyapunov theory, the convergence of the critic NN weights and the stability of the entire closed-loop system are proved. Simulation results and comparative analyses verify the effectiveness of the proposed control strategy.
- Research Article
- 10.1109/tnnls.2021.3107550
- Jun 1, 2023
- IEEE Transactions on Neural Networks and Learning Systems
This article investigates the approximate optimal control problem for nonlinear affine systems under a periodic event-triggered control (PETC) strategy. In terms of optimal control, a theoretical comparison of continuous control, traditional event-triggered control (ETC), and PETC is carried out from the perspective of stability and convergence, concluding that PETC does not significantly affect the convergence rate compared with ETC. This is the first time PETC has been presented for the optimal control of nonlinear systems. A critic network is introduced to approximate the optimal value function based on the idea of reinforcement learning (RL). It is proven that the discrete update time series from PETC can also be used to determine the update times of the learning network; in this way, the gradient-based weight estimation for continuous systems is developed in discrete form. Then, the uniformly ultimately bounded (UUB) condition of the controlled system is analyzed to ensure the stability of the designed method. Finally, two illustrative examples show the effectiveness of the method.
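A minimal sketch of the periodic event-triggered mechanism on a scalar plant (the plant, trigger threshold, and feedback gain are hypothetical, not the article's design): the trigger condition is evaluated only at periodic sampling instants, and the state is re-transmitted to the controller only when the held value has drifted too far from the current one.

```python
def petc_simulation(T=10.0, h=0.05, sigma=0.5):
    """Periodic event-triggered control of the hypothetical scalar plant
    x' = x + u with held-state feedback u = -2 * x_hat.
    The trigger is checked only at periodic sampling instants t = j*h;
    between events the last transmitted state is held."""
    n_steps = int(round(T / h))
    x, x_hat = 1.0, 1.0        # plant state and last transmitted state
    updates = 0
    for _ in range(n_steps):
        u = -2.0 * x_hat                      # control held between events
        x = x + h * (x + u)                   # forward-Euler plant step
        if abs(x - x_hat) > sigma * abs(x):   # periodic trigger check
            x_hat = x                         # event: transmit current state
            updates += 1
    return x, updates, n_steps

x_final, updates, n_steps = petc_simulation()
```

The state still converges toward the origin while far fewer than `n_steps` transmissions occur, which is the communication saving PETC targets; checking the condition only at sampling instants is what distinguishes it from continuously monitored ETC.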
- Research Article
- 10.1109/tsmcb.2012.2194781
- May 10, 2012
- IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
This paper addresses the approximate optimal control problem for a class of parabolic partial differential equation (PDE) systems with nonlinear spatial differential operators. An approximate optimal control design method is proposed on the basis of the empirical eigenfunctions (EEFs) and neural network (NN). First, based on the data collected from the PDE system, the Karhunen-Loève decomposition is used to compute the EEFs. With those EEFs, the PDE system is formulated as a high-order ordinary differential equation (ODE) system. To further reduce its dimension, the singular perturbation (SP) technique is employed to derive a reduced-order model (ROM), which can accurately describe the dominant dynamics of the PDE system. Second, the Hamilton-Jacobi-Bellman (HJB) method is applied to synthesize an optimal controller based on the ROM, where the closed-loop asymptotic stability of the high-order ODE system can be guaranteed by the SP theory. By dividing the optimal control law into two parts, the linear part is obtained by solving an algebraic Riccati equation, and a new type of HJB-like equation is derived for designing the nonlinear part. Third, a control update strategy based on successive approximation is proposed to solve the HJB-like equation, and its convergence is proved. Furthermore, an NN approach is used to approximate the cost function. Finally, we apply the developed approximate optimal control method to a diffusion-reaction process with a nonlinear spatial operator, and the simulation results illustrate its effectiveness.
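The Karhunen-Loève step can be sketched with synthetic snapshot data: the empirical eigenfunctions are the leading left singular vectors of the mean-removed snapshot matrix. The two-mode field below is an illustrative stand-in for PDE simulation data, not the paper's diffusion-reaction process.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "snapshot" data standing in for PDE simulation output: a field on
# a 100-point spatial grid that truly lives on two modes, plus small noise.
z = np.linspace(0.0, 1.0, 100)
modes = np.stack([np.sin(np.pi * z), np.sin(2.0 * np.pi * z)])   # (2, 100)
coeffs = rng.standard_normal((200, 2))                           # 200 snapshots
snapshots = coeffs @ modes + 0.01 * rng.standard_normal((200, 100))

# Karhunen-Loeve decomposition: the empirical eigenfunctions (EEFs) are the
# leading left singular vectors of the mean-removed snapshot matrix.
X = snapshots - snapshots.mean(axis=0)
U, s, _ = np.linalg.svd(X.T, full_matrices=False)
eefs = U[:, :2]                                   # first two EEFs, shape (100, 2)
energy = np.sum(s[:2] ** 2) / np.sum(s ** 2)      # energy captured by two modes

# Sanity check: the first true mode lies (almost) in the span of the EEFs.
m1 = modes[0] / np.linalg.norm(modes[0])
resid = float(np.linalg.norm(m1 - eefs @ (eefs.T @ m1)))
```

Projecting the PDE state onto the retained EEFs then yields the high-order ODE system that the singular perturbation step reduces further.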
- Conference Article
- 10.1109/ccdc.2019.8832761
- Jun 1, 2019
In this paper, an iterative approximate dynamic programming algorithm is proposed for solving the approximate optimal tracking control problem for a class of nonlinear discrete-time switched systems. First, the optimal tracking problem is converted into designing an optimal regulator for the tracking error dynamics. Then, the iterative approximate dynamic programming algorithm is applied to obtain the approximate solution of the Hamilton-Jacobi-Bellman (HJB) equation. Next, two neural networks are used to approximate the iterative cost function and the iterative control law, respectively. Finally, simulation results are given to verify the effectiveness of the proposed algorithm.
- Conference Article
- 10.1109/ccdc.2015.7162722
- May 1, 2015
This paper deals with the approximate optimal inventory control problem for supply chain networks with lead time. By introducing a sensitivity parameter, the original optimal control problem is transformed into a sequence of two-point boundary value (TPBV) problems without a time-delay term. The existence and uniqueness of the optimal control law are then established. By using a finite sum of the series, a suboptimal inventory replenishment strategy for the supply chain network system is proposed. Simulation examples show that the proposed inventory replenishment strategy effectively reduces the bullwhip effect and thereby improves the performance of the supply chain network system.
- Research Article
- 10.1017/s1446181110000040
- Oct 1, 2009
- The ANZIAM Journal
In this paper, an efficient computation method is developed for solving a general class of minmax optimal control problems, where the minimum deviation from the violation of the continuous state inequality constraints is maximized. The constraint transcription method is used to construct a smooth approximate function for each of the continuous state inequality constraints. We then obtain an approximate optimal control problem with the integral of the summation of these smooth approximate functions as its cost function. A necessary condition and a sufficient condition are derived showing the relationship between the original problem and the smooth approximate problem. We then construct a violation function from the solution of the smooth approximate optimal control problem and the original continuous state inequality constraints in such a way that the optimal control of the minmax problem is equivalent to the largest root of the violation function, and hence can be solved by the bisection search method. The control parametrization and a time scaling transform are applied to these optimal control problems. We then consider two practical problems: the obstacle avoidance optimal control problem and the abort landing of an aircraft in a windshear downburst.
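The constraint transcription step replaces the non-smooth functional max(0, g) by a smooth approximation. One standard form, shown here as an illustration with smoothing parameter `eps`, is exact outside the interval (-eps, eps) and blends quadratically inside it:

```python
def smooth_max0(g, eps):
    """Smooth approximation of max(0, g) used in constraint transcription:
    exact for |g| >= eps, quadratic blend on (-eps, eps)."""
    if g <= -eps:
        return 0.0
    if g >= eps:
        return g
    return (g + eps) ** 2 / (4.0 * eps)
```

The approximation is continuously differentiable, and its worst-case gap from max(0, g) is eps/4, attained at g = 0, so letting eps shrink tightens the smooth approximate problem toward the original constrained one.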
- Book Chapter
- 10.1007/978-3-319-78384-0_4
- Jan 1, 2018
This chapter develops a data-driven implementation of model-based reinforcement learning to solve approximate optimal control problems online under a persistence of excitation-like rank condition. The development is based on the observation that, given a model of the system, reinforcement learning can be implemented by evaluating the Bellman error at any number of desired points in the state-space. In this result, a parametric system model is considered, and a data-driven parameter identifier is developed to compensate for uncertainty in the parameters. Uniformly ultimately bounded regulation of the system states to a neighborhood of the origin, and convergence of the developed policy to a neighborhood of the optimal policy are established using a Lyapunov-based analysis. Simulation results indicate that the developed controller can be implemented to achieve fast online learning without the addition of ad-hoc probing signals as in Chap. 3. The developed model-based reinforcement learning method is extended to solve trajectory tracking problems for uncertain nonlinear systems, and to generate approximate feedback-Nash equilibrium solutions to N-player nonzero-sum differential games.
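The model-based evaluation of the Bellman error can be sketched on a scalar linear-quadratic problem, where the critic weight has a known closed form from the scalar Riccati equation (the plant, cost weights, sample states, and learning rate are all illustrative): given a model, the error is evaluated at arbitrarily chosen states rather than along a measured trajectory, which is the observation the chapter builds on.

```python
import numpy as np

# Hypothetical scalar plant x' = a*x + b*u with cost integral of q*x^2 + r*u^2.
a, b, q, r = -1.0, 1.0, 1.0, 1.0
w = 0.0                                   # critic weight in V(x) = w * x^2

# With a model available, the Bellman (HJB) error can be evaluated at any
# chosen states -- no probing signal or trajectory data is required.
states = np.linspace(-2.0, 2.0, 21)

for _ in range(2000):
    u = -(b / r) * w * states             # policy implied by the current critic
    delta = q * states**2 + r * u**2 + 2.0 * w * states * (a * states + b * u)
    grad = np.sum(delta * states**2 * (2.0 * a - 2.0 * (b**2 / r) * w))
    w -= 1e-3 * grad                      # gradient step on 0.5 * sum(delta^2)

# Closed-form fixed point from the scalar Riccati equation.
w_star = (r / b**2) * (a + np.sqrt(a**2 + q * b**2 / r))
```

Driving the Bellman error to zero at the sampled states recovers the Riccati solution w* = sqrt(2) - 1 for these illustrative parameters; in the chapter's setting the sampled-state evaluation additionally replaces persistent excitation along the trajectory with a rank condition on the chosen points.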
- Research Article
- 10.1016/j.neunet.2024.106880
- Nov 6, 2024
- Neural Networks
Barrier-critic-disturbance approximate optimal control of nonzero-sum differential games for modular robot manipulators