Abstract

In this paper, output-feedback-based finite-horizon near-optimal regulation of nonlinear affine discrete-time systems with unknown system dynamics is considered by using neural networks (NNs) to approximate the solution of the Hamilton-Jacobi-Bellman (HJB) equation. First, an NN-based Luenberger observer is proposed to reconstruct both the system states and the control coefficient matrix. Next, a reinforcement learning methodology with an actor-critic structure is utilized to approximate the time-varying solution of the HJB equation, referred to as the value function, by using an NN. To satisfy the terminal constraint, a new error term is defined and incorporated into the NN update law so that the terminal constraint error is also minimized over time. An NN with constant weights and a time-dependent activation function is employed to approximate the time-varying value function, which is subsequently utilized to generate the control policy; owing to NN reconstruction errors, the resulting finite-horizon policy is near optimal rather than exactly optimal. The proposed scheme functions in a forward-in-time manner without an offline training phase. Lyapunov analysis is used to investigate the stability of the overall closed-loop system. Simulation results are given to show the effectiveness and feasibility of the proposed method.
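For concreteness, the finite-horizon setting the abstract describes can be sketched as follows; the notation (f, g, C, Q, R, phi, N) and the quadratic cost structure are assumptions chosen for illustration, not taken verbatim from the paper:

```latex
% Sketch of the finite-horizon optimal regulation setup (assumed notation).
\begin{align*}
  & x_{k+1} = f(x_k) + g(x_k)\,u_k, \qquad y_k = C x_k, \qquad k = 0,1,\dots,N-1,\\
  & V(x_k,k) = \min_{u_k,\dots,u_{N-1}} \Big[\, \phi(x_N)
      + \sum_{i=k}^{N-1} \big( Q(x_i) + u_i^{\top} R\, u_i \big) \Big],\\
  % Discrete-time HJB (Bellman) equation with its terminal constraint:
  & V(x_k,k) = \min_{u_k} \big[\, Q(x_k) + u_k^{\top} R\, u_k
      + V(x_{k+1},k+1) \big], \qquad V(x_N,N) = \phi(x_N),\\
  % Constant-weight NN with a time-dependent activation, as in the abstract:
  & V(x_k,k) \approx W^{\top} \sigma(x_k,\, N-k).
\end{align*}
```

Because the value function depends explicitly on the time-to-go N - k, the time dependence is pushed into the activation sigma so that the weight vector W can remain constant, which is what makes a forward-in-time tuning scheme possible.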
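The abstract also alludes to an update law in which a terminal-constraint residual is minimized alongside the usual Bellman residual. Below is a minimal, hypothetical Python sketch of such a critic update; the basis sigma, the gains alpha and beta, and all other names are illustrative assumptions, not the paper's actual law:

```python
import numpy as np

def sigma(x, steps_to_go):
    """Time-dependent activation: polynomial state features scaled
    by the time-to-go (assumed form, for illustration only)."""
    z = np.concatenate([x, x ** 2])
    return z * (1.0 + steps_to_go)

def critic_update(W, x_k, u_k, x_next, k, N, Q, R, phi, alpha=0.05, beta=0.05):
    """One gradient step that shrinks both the Bellman residual and
    the terminal-constraint residual (hypothetical update law)."""
    r = x_k @ Q @ x_k + u_k @ R @ u_k                  # one-step cost
    s_k = sigma(x_k, N - k)                            # features at step k
    s_next = sigma(x_next, N - k - 1)                  # features at step k + 1
    e_bellman = r + W @ s_next - W @ s_k               # Bellman (TD) error
    e_terminal = W @ sigma(x_next, 0) - phi(x_next)    # terminal-cost mismatch
    # Descend both squared errors so the terminal constraint
    # is also minimized over time, as the abstract describes.
    W = W - alpha * e_bellman * (s_next - s_k) \
          - beta * e_terminal * sigma(x_next, 0)
    return W
```

Driving e_terminal toward zero along the trajectory is what lets a constant-weight NN approach the terminal condition V(x_N, N) = phi(x_N) without any backward-in-time sweep or offline training phase.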
