Abstract

In this paper, an effective off-policy algorithm is proposed to solve the continuous-time nonzero-sum (NZS) control problem for unknown nonlinear systems with saturated actuators. A class of nonquadratic functions is used to construct the performance functions so that the input constraints are handled. Using the integral reinforcement learning (IRL) technique, an off-policy learning mechanism is introduced to design an iterative method for the continuous-time NZS constrained control problem that does not require knowledge of the system dynamics. To show the convergence of the proposed method, the traditional policy iteration (PI) method for the continuous-time NZS control problem with saturated actuators is discussed first. The proposed method is then proved to be equivalent to the traditional PI method. Neural networks are used to construct the actor-critic structure: the critic neural networks approximate the iterative value functions, and the actor neural networks approximate the iterative control policies. Finally, two cases are simulated to verify the effectiveness of the proposed method.
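
The abstract does not state which nonquadratic function is used. As a minimal sketch only, the Python below assumes the form commonly used in the constrained-input literature for a scalar input bounded by |u| < lam, namely W(u) = 2 * r * integral from 0 to u of lam * atanh(v / lam) dv, together with the tanh-shaped policy that minimizes the resulting Hamiltonian. The function names, the scalar setting, and the parameters `lam` (saturation bound), `r` (input weight), and `g_grad_v` (the product of the input gain and the value gradient) are illustrative assumptions, not taken from the paper.

```python
import math


def nonquadratic_penalty(u, lam, r=1.0):
    """Closed form of W(u) = 2*r * integral_0^u lam*atanh(v/lam) dv (assumed form)."""
    if abs(u) >= lam:
        raise ValueError("u must satisfy |u| < lam")
    s = u / lam
    # Integrating atanh gives s*atanh(s) + 0.5*log(1 - s^2), scaled by lam^2.
    return 2.0 * r * lam * u * math.atanh(s) + r * lam ** 2 * math.log1p(-s * s)


def penalty_by_quadrature(u, lam, r=1.0, n=2000):
    """Midpoint-rule evaluation of the same integral, used as a cross-check."""
    h = u / n
    total = sum(lam * math.atanh((k + 0.5) * h / lam) for k in range(n))
    return 2.0 * r * total * h


def saturated_policy(g_grad_v, lam, r=1.0):
    """Stationary point of W(u) + g_grad_v * u; the result always lies in [-lam, lam]."""
    return -lam * math.tanh(g_grad_v / (2.0 * lam * r))
```

For small inputs this penalty behaves like the usual quadratic cost r * u^2, while the tanh in the policy keeps the control inside the actuator bound regardless of how large the value gradient becomes. In an NZS setting each player would carry its own bound and weight.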
