Approximate/adaptive Dynamic Programming Research Articles

We are motivated by the real challenges presented in a human-robot system to develop new designs that are efficient at the data level and offer performance guarantees, such as stability and optimality, at the system level. Existing approximate/adaptive dynamic programming (ADP) results that treat system performance theoretically do not readily provide practically useful learning control algorithms for this problem, and reinforcement learning (RL) algorithms that address data efficiency usually lack performance guarantees for the controlled system. This study fills these gaps by introducing innovative features to the policy iteration algorithm. We introduce flexible policy iteration (FPI), which can flexibly integrate experience replay and supplemental values from prior experience into the RL controller. We establish system-level performance properties, including convergence of the approximate value function, (sub)optimality of the solution, and stability of the system. We demonstrate the effectiveness of FPI via realistic simulations of the human-robot system. The problem we face in this study may be difficult to address with design methods based on classical control theory, as it is nearly impossible to obtain a customized mathematical model of a human-robot system either online or offline. The results also indicate the great potential of RL control for solving realistic and challenging problems with high-dimensional control inputs.
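
To make the idea of policy iteration driven by replayed experience concrete, here is a minimal sketch. It is not the FPI algorithm from the abstract: it is a generic batch least-squares policy iteration that reuses one fixed buffer of transitions at every iteration. The scalar linear system, the quadratic cost, the quadratic Q-function features, and all function names (`collect_buffer`, `evaluate_policy`, `policy_iteration`, `solve_linear`) are assumptions chosen for illustration. The learner never reads the model parameters directly; it only sees stored tuples (x, u, cost, x').

```python
import random


def solve_linear(A, y):
    """Solve a small dense system A z = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def features(x, u):
    """Quadratic features, so that Q(x, u) = theta . [x^2, x*u, u^2] (an assumed form)."""
    return [x * x, x * u, u * u]


def collect_buffer(a, b, q, r, n=200, seed=0):
    """Fill a replay buffer with transitions from the assumed system x' = a*x + b*u."""
    rng = random.Random(seed)
    buffer = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        u = rng.uniform(-1.0, 1.0)  # exploratory input, independent of x
        buffer.append((x, u, q * x * x + r * u * u, a * x + b * u))
    return buffer


def evaluate_policy(buffer, k):
    """Policy evaluation for u = -k*x: least-squares fit of the Bellman equation
    theta . (phi(x, u) - phi(x', -k*x')) = cost over the whole buffer."""
    AtA = [[0.0] * 3 for _ in range(3)]
    Atc = [0.0] * 3
    for x, u, cost, x_next in buffer:
        phi = features(x, u)
        phi_next = features(x_next, -k * x_next)
        d = [p - pn for p, pn in zip(phi, phi_next)]
        for i in range(3):
            Atc[i] += d[i] * cost
            for j in range(3):
                AtA[i][j] += d[i] * d[j]
    return solve_linear(AtA, Atc)


def policy_iteration(buffer, k0=0.0, iters=10):
    """Alternate evaluation and greedy improvement, replaying the same buffer each time."""
    k = k0  # k0 must be stabilizing; k0 = 0 works here because |a| < 1
    for _ in range(iters):
        _, t2, t3 = evaluate_policy(buffer, k)
        k = t2 / (2.0 * t3)  # minimizer of the fitted Q over u is u = -(t2 / (2*t3)) * x
    return k


buffer = collect_buffer(a=0.9, b=0.5, q=1.0, r=1.0)
print(policy_iteration(buffer))
```

Because the assumed system is deterministic and the true Q-function of a linear policy is exactly quadratic here, the least-squares fit is exact up to rounding, so the iteration should reproduce the classical linear-quadratic gain. With noisy data or a nonlinear plant, neither property would hold and a richer approximator would be needed.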

Value iteration-based approximate/adaptive dynamic programming (ADP) is investigated as an approximate solution to infinite-horizon optimal control problems with deterministic dynamics and continuous state and action spaces. The learning iterations are decomposed into an outer loop and an inner loop. A relatively simple proof of the convergence of the outer-loop iterations to the optimal solution is provided; it rests on an analogy between the value function during the iterations and the value function of a fixed-final-time optimal control problem. The inner loop avoids the need to numerically solve a set of nonlinear equations or a nonlinear optimization problem for the policy update at each ADP iteration. Sufficient conditions are obtained for the uniqueness of the solution to the policy update equation and for the convergence of the inner-loop iterations to that solution. The results are then formulated as a learning algorithm for training a neurocontroller or creating a look-up table to be used for optimal control of nonlinear systems with different initial conditions. Finally, some features of the investigated method are analyzed numerically.
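
The outer/inner loop structure described above can be sketched on a toy problem. The following assumes a scalar linear system x_{k+1} = a x_k + b u_k with cost q x^2 + r u^2 and a quadratic value function V_i(x) = p_i x^2. These choices and all names in the code are illustrative assumptions, not details from the paper. Minimizing q x^2 + r u^2 + V_i(a x + b u) over u gives the policy update equation u = -(b / (2r)) V_i'(a x + b u), which is implicit in u. For a linear policy u = -k x it reads k = (b p_i / r)(a - b k), and the inner loop solves it by fixed-point iteration instead of a numerical root finder.

```python
def inner_loop(p, a, b, r, tol=1e-12, max_iter=1000):
    """Fixed-point iteration on the policy update equation k = (b*p/r) * (a - b*k).

    In this toy case the map is a contraction when b*b*p/r < 1.
    """
    k = 0.0
    for _ in range(max_iter):
        k_next = (b * p / r) * (a - b * k)
        if abs(k_next - k) < tol:
            return k_next
        k = k_next
    raise RuntimeError("inner loop did not converge")


def value_iteration(a, b, q, r, tol=1e-12, max_iter=10000):
    """Outer loop: V_{i+1}(x) = q*x^2 + r*u^2 + V_i(a*x + b*u) with u = -k*x."""
    p = 0.0  # start from V_0 = 0
    for _ in range(max_iter):
        k = inner_loop(p, a, b, r)                   # policy update (inner loop)
        p_next = q + r * k * k + p * (a - b * k) ** 2  # value update (outer loop)
        if abs(p_next - p) < tol:
            return p_next, k
        p = p_next
    raise RuntimeError("outer loop did not converge")


p, k = value_iteration(a=0.9, b=0.5, q=1.0, r=1.0)
print(p, k)
```

With these assumed parameters, b^2 p / r stays below 1 for every iterate, so the inner loop converges; this is a toy counterpart of the sufficient conditions mentioned in the abstract. The limit p should satisfy the scalar discrete-time Riccati equation p = q + a^2 p - (a b p)^2 / (r + b^2 p).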

Articles published on Approximate/adaptive Dynamic Programming

Concurrent learning for adaptive Pontryagin's maximum principle of nonlinear systems with inequality constraints

Reinforcement Learning Control of Robotic Knee With Human-in-the-Loop by Flexible Policy Iteration

Online [formula omitted] control for continuous-time nonlinear large-scale systems via single echo state network

Adaptive Reinforcement Learning Strategy with Sliding Mode Control for Unknown and Disturbed Wheeled Inverted Pendulum

Pareto optimal control of the mean-field stochastic systems by adaptive dynamic programming algorithm

An improvement of single-network adaptive critic design for nonlinear systems with asymmetry constraints

Output-feedback adaptive optimal control of interconnected systems based on robust adaptive dynamic programming

Revisiting approximate dynamic programming and its convergence

Learning Control of Dynamical Systems Based on Markov Decision Processes: Research Frontiers and Outlooks

A boundedness result for the direct heuristic dynamic programming

Approximate Dynamic Programming for Optimal Stationary Control With Control-Dependent Noise

Adaptive dynamic programming for online solution of a zero-sum differential game

Adaptive Dynamic Programming: An Introduction

Issues on Stability of ADP Feedback Controllers for Dynamical Systems
